Change to the ReLU initialization recommended by TensorFlow.
In the TensorFlow tutorial, ReLU layers are initialized with weights drawn from
a truncated normal distribution plus a small positive bias, so no unit starts
with zero activation (a dead ReLU that receives no gradient).
Before, we initialized them only with a Gaussian centered around 0.
The new initialization leads to much faster convergence.
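A minimal NumPy sketch of this scheme (the stddev/bias value of 0.1 and the
layer shape are illustrative defaults in the spirit of the TensorFlow MNIST
tutorial; the re-sampling loop is an assumption standing in for
tf.truncated_normal, which re-draws values beyond two standard deviations):

```python
import numpy as np

def truncated_normal(shape, stddev=0.1, seed=None):
    """Sample N(0, stddev), re-drawing any value more than two standard
    deviations from the mean (mimicking tf.truncated_normal)."""
    rng = np.random.default_rng(seed)
    samples = rng.normal(0.0, stddev, size=shape)
    out_of_range = np.abs(samples) > 2 * stddev
    while out_of_range.any():
        samples[out_of_range] = rng.normal(0.0, stddev, size=int(out_of_range.sum()))
        out_of_range = np.abs(samples) > 2 * stddev
    return samples

# Weights: truncated normal with small stddev.
# Biases: small positive constant, so the ReLUs start in the active
# (non-zero-gradient) regime instead of being dead at initialization.
W = truncated_normal((784, 128), stddev=0.1, seed=0)
b = np.full(128, 0.1)
```

Compared with the previous plain Gaussian (and zero bias), roughly half the
units would otherwise start with negative pre-activations and contribute no
gradient, which is what slowed convergence.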