defxavier_uniform_(tensor, gain=1.): # type: (Tensor, float) -> Tensor r"""Fills the input `Tensor` with values according to the method described in `Understanding the difficulty of training deep feedforward neural networks` - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from :math:`\mathcal{U}(-a, a)` where .. math:: a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}} Also known as Glorot initialization. Args: tensor: an n-dimensional `torch.Tensor` gain: an optional scaling factor Examples: >>> w = torch.empty(3, 5) >>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu')) """ fan_in, fan_out = _calculate_fan_in_and_fan_out(tensor) std = gain * math.sqrt(2.0 / float(fan_in + fan_out)) a = math.sqrt(3.0) * std # Calculate uniform bounds from standard deviation
return _no_grad_uniform_(tensor, -a, a)
所以xavier_uniform_就是在$-gain \times \sqrt{3} \times \sqrt{\frac{2}{fan_{in}+fan_{out}}}, gain \times \sqrt{3} \times \sqrt{\frac{2}{fan_{in}+fan_{out}}}$范围内取值。