
LayerNorm

This node normalizes over features so that the mean is close to zero and the standard deviation is close to one. The number of dimensions to normalize over can be configured. That is, given an input of shape (C, X, Y) and with <Final dimensions> set to two, the normalization occurs over the last two dimensions (X, Y), separately for each feature in the dimension C.

Additionally, a learnable weight \(\gamma\) and bias \(\beta\) are applied after normalization. For an input of shape (C, X, Y) with <Final dimensions> set to two, these parameters have size (X, Y) and are applied elementwise, with the same values reused for each feature in the dimension C. The mathematical expression is:

\(y = \frac{x - E[x]}{\sqrt{V[x] + \epsilon}} \cdot \gamma + \beta\)
where \(E[x]\) is the mean, \(V[x]\) is the variance (the standard deviation squared), and \(\epsilon\) is a small configurable constant.
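
As a reference, the computation can be sketched in NumPy. The function layer_norm below is a hypothetical helper for illustration, not the node's actual implementation:

```python
import numpy as np

def layer_norm(x, gamma, beta, final_dims=2, eps=1e-5):
    # Normalize over the last `final_dims` dimensions of x.
    axes = tuple(range(x.ndim - final_dims, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # gamma and beta have the shape of the normalized dimensions
    # and broadcast over the leading dimensions.
    return x_hat * gamma + beta

# Input of shape (C, X, Y) with <Final dimensions> = 2:
x = np.random.randn(3, 4, 5).astype(np.float32)
gamma = np.ones((4, 5), dtype=np.float32)   # weight, initialized to 1
beta = np.zeros((4, 5), dtype=np.float32)   # bias, initialized to 0
y = layer_norm(x, gamma, beta)
assert y.shape == x.shape
```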

Layer normalization is described in the paper Layer Normalization (Ba, Kiros and Hinton, 2016). The paper shows that normalizing inputs can significantly reduce training time and can also improve model performance. Layer normalization is especially effective for recurrent networks.

The only supported input data type is Float32, and the input must have at least <Final dimensions> dimensions. To determine the sizes of the weight and bias parameters, the sizes of the last <Final dimensions> dimensions must be known (they cannot be ?).

The output will have the same shape and data type as the input.
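
Assuming the node behaves as described above, it corresponds to PyTorch's torch.nn.LayerNorm, which also normalizes over the trailing dimensions and preserves the input's shape and data type:

```python
import torch

# <Final dimensions> = 2 on a (C, X, Y) input maps to normalized_shape=(X, Y).
ln = torch.nn.LayerNorm(normalized_shape=(4, 5), eps=1e-5)
x = torch.randn(3, 4, 5)    # Float32 input of shape (C, X, Y)
y = ln(x)
print(y.shape)              # torch.Size([3, 4, 5]) -- same shape as the input
print(y.dtype)              # torch.float32        -- same data type
print(ln.weight.shape)      # torch.Size([4, 5])   -- gamma, initialized to ones
print(ln.bias.shape)        # torch.Size([4, 5])   -- beta, initialized to zeros
```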

By clicking the node, the following parameters can be configured in the panel on the right:

  • Epsilon: A small constant added to the variance to avoid dividing by zero.
  • Final dimensions: Number of dimensions to normalize over, starting from the end of the input.
  • Number of connections: Increases the number of connections on this node. The first input leads to the first output, the second input to the second output, and so on. This is useful because even though the connections are separate, they share the same learned parameters, i.e. the learned weight \(\gamma\) and bias \(\beta\) (see the sketch after this list).
  • Weight initializer: Initial value for the weight. If unsure, use a value of 1.
  • Bias initializer: Initial value for the bias. If unsure, use a value of 0.
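
To illustrate the parameter sharing mentioned under Number of connections, applying the same layer to several inputs reuses a single set of weight and bias. Again a PyTorch sketch of the idea, not the node's implementation:

```python
import torch

ln = torch.nn.LayerNorm((4, 5))
a = torch.randn(3, 4, 5)    # first connection
b = torch.randn(7, 4, 5)    # second connection; leading dims may differ
ya, yb = ln(a), ln(b)       # both outputs use the same ln.weight and ln.bias
```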
