
groupnorm

Normalize data across grouped subsets of channels for each observation independently

    Description

    The group normalization operation normalizes the input data across grouped subsets of channels for each observation independently. To speed up training of convolutional neural networks and reduce the sensitivity to network initialization, use group normalization between convolution and nonlinear operations such as relu.

    After normalization, the operation scales the normalized data by a learnable scale factor γ and then shifts it by a learnable offset β.

    The groupnorm function applies the group normalization operation to dlarray data. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

    Note

    To apply group normalization within a layerGraph object or Layer array, use groupNormalizationLayer.


    dlY = groupnorm(dlX,numGroups,offset,scaleFactor) applies the group normalization operation to the input data dlX using the specified number of groups and transforms it using the specified offset and scale factor.

    The function normalizes over grouped subsets of the 'C' (channel) dimension and the 'S' (spatial), 'T' (time), and 'U' (unspecified) dimensions of dlX for each observation in the 'B' (batch) dimension, independently.

    For unformatted input data, use the 'DataFormat' option.


    dlY = groupnorm(dlX,numGroups,offset,scaleFactor,'DataFormat',FMT) applies the group normalization operation to the unformatted dlarray object dlX with format specified by FMT. The output dlY is an unformatted dlarray object with dimensions in the same order as dlX. For example, 'DataFormat','SSCB' specifies data for 2-D image input with format 'SSCB' (spatial, spatial, channel, batch).
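
    The following is a minimal sketch of this syntax with an unformatted dlarray. The array sizes, the number of groups, and the parameter values are illustrative assumptions, not values taken from this page.

```matlab
% Illustrative sizes (assumed): 4-by-4 images, 6 channels, 2 observations
X = rand(4,4,6,2);
dlX = dlarray(X);                 % unformatted: no dimension labels

offset = zeros(6,1);
scaleFactor = ones(6,1);

% Supply the dimension labels through the 'DataFormat' option
dlY = groupnorm(dlX,3,offset,scaleFactor,'DataFormat','SSCB');

size(dlY)     % same size as dlX
dims(dlY)     % empty, because dlY is unformatted
```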


    dlY = groupnorm(___,Name,Value) specifies options using one or more name-value arguments in addition to the input arguments in previous syntaxes. For example, 'Epsilon',3e-5 sets the variance offset to 3e-5.
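
    As a rough illustration of the 'Epsilon' option, the sketch below normalizes the same data with two different variance offsets. The data sizes and the two offset values are assumptions chosen only to make the effect visible: a larger offset enlarges the denominator, so the normalized values shrink slightly.

```matlab
% Illustrative data (assumed sizes): one 4-by-4 observation with 6 channels
dlX = dlarray(rand(4,4,6,1),'SSCB');
offset = zeros(6,1);
scaleFactor = ones(6,1);

% Same input, two different variance offsets
dlYSmall = groupnorm(dlX,3,offset,scaleFactor,'Epsilon',1e-5);
dlYLarge = groupnorm(dlX,3,offset,scaleFactor,'Epsilon',1e-1);

ySmall = extractdata(dlYSmall);
yLarge = extractdata(dlYLarge);

% The larger offset produces outputs of smaller magnitude
maxSmall = max(abs(ySmall),[],'all')
maxLarge = max(abs(yLarge),[],'all')
```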

    Examples


    Use groupnorm to normalize input data across channel groups.

    Create the input data as a single observation of random values with a height of 4, a width of 4, and 6 channels.

    height = 4;
    width = 4;
    channels = 6;
    observations = 1;
    
    X = rand(height,width,channels,observations);
    dlX = dlarray(X,'SSCB');

    Create the learnable parameters.

    offset = zeros(channels,1);
    scaleFactor = ones(channels,1);

    Compute the group normalization. Divide the input into three groups of two channels each.

    numGroups = 3;
    dlY = groupnorm(dlX,numGroups,offset,scaleFactor);
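
    One way to sanity-check the result is to inspect the statistics of each group in the output. The sketch below repeats the setup so that it runs on its own, and it assumes that each group consists of consecutive channels (channels 1 and 2, then 3 and 4, then 5 and 6). With a zero offset and a unit scale factor, each group should have a mean close to 0 and a variance close to 1.

```matlab
height = 4; width = 4; channels = 6; numGroups = 3;
dlX = dlarray(rand(height,width,channels,1),'SSCB');
dlY = groupnorm(dlX,numGroups,zeros(channels,1),ones(channels,1));

Y = extractdata(dlY);                      % plain numeric array
channelsPerGroup = channels/numGroups;
groupMean = zeros(numGroups,1);
groupVar = zeros(numGroups,1);

for g = 1:numGroups
    % Assumption: group g holds consecutive channels
    idx = (g-1)*channelsPerGroup + (1:channelsPerGroup);
    Yg = Y(:,:,idx);
    groupMean(g) = mean(Yg(:));
    groupVar(g) = var(Yg(:),1);            % population variance
end

disp([groupMean groupVar])                 % columns: mean, variance
```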
    

    Input Arguments


    dlX: Input data, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

    If dlX is an unformatted dlarray or a numeric array, then you must specify the format using the 'DataFormat' option. If dlX is a numeric array, then either scaleFactor or offset must be a dlarray object.

    dlX must have a 'C' (channel) dimension.
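
    A minimal sketch of the numeric-array case described above follows. The sizes and parameter values are assumptions for illustration; the offset is wrapped in a dlarray so that at least one of the learnable parameters is a dlarray object.

```matlab
X = rand(4,4,6,1);                 % plain numeric array, no labels
offset = dlarray(zeros(6,1));      % dlarray parameter
scaleFactor = ones(6,1);           % numeric parameter

% A numeric input carries no format, so 'DataFormat' is required
dlY = groupnorm(X,3,offset,scaleFactor,'DataFormat','SSCB');

class(dlY)
```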

    numGroups: Number of channel groups to normalize across, specified as a positive integer, 'all-channels', or 'channel-wise'.

    numGroups           Description
    positive integer    Divide the incoming channels into the specified number of groups. The specified number of groups must divide the number of channels of the input data exactly.
    'all-channels'      Group all incoming channels into a single group. The input data is normalized across all channels. This operation is also known as layer normalization. Alternatively, use layernorm.
    'channel-wise'      Treat all incoming channels as separate groups. This operation is also known as instance normalization. Alternatively, use instancenorm.
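
    The sketch below compares the two named options with their integer counterparts. It assumes, following the descriptions above, that 'all-channels' behaves like a single group and that 'channel-wise' behaves like one group per channel. The data sizes are illustrative. If the assumption holds, both differences are zero up to round-off.

```matlab
channels = 6;
dlX = dlarray(rand(4,4,channels,2),'SSCB');
offset = zeros(channels,1);
scaleFactor = ones(channels,1);

dlYAll  = groupnorm(dlX,'all-channels',offset,scaleFactor);
dlYOne  = groupnorm(dlX,1,offset,scaleFactor);           % one group

dlYEach = groupnorm(dlX,'channel-wise',offset,scaleFactor);
dlYSix  = groupnorm(dlX,channels,offset,scaleFactor);    % one group per channel

errAll  = max(abs(extractdata(dlYAll)  - extractdata(dlYOne)),[],'all')
errEach = max(abs(extractdata(dlYEach) - extractdata(dlYSix)),[],'all')
```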

    Data Types: single | double | char | string

    offset: Offset β, specified as a formatted dlarray, an unformatted dlarray, or a numeric array with one nonsingleton dimension with size matching the size of the 'C' (channel) dimension of the input dlX.

    If offset is a formatted dlarray object, then the nonsingleton dimension must have label 'C' (channel).

    scaleFactor: Scale factor γ, specified as a formatted dlarray, an unformatted dlarray, or a numeric array with one nonsingleton dimension with size matching the size of the 'C' (channel) dimension of the input dlX.

    If scaleFactor is a formatted dlarray object, then the nonsingleton dimension must have label 'C' (channel).

    Name-Value Arguments

    Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

    Example: 'Epsilon',3e-5 sets the variance offset to 3e-5.

    Dimension order of unformatted input data, specified as the comma-separated pair consisting of 'DataFormat' and a character vector or string scalar FMT that provides a label for each dimension of the data.

    When you specify the format of a dlarray object, each character provides a label for each dimension of the data and must be one of the following:

    • "S" — Spatial

    • "C" — Channel

    • "B" — Batch (for example, samples and observations)

    • "T" — Time (for example, time steps of sequences)

    • "U" — Unspecified

    You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" at most once.

    You must specify DataFormat when the input data is not a formatted dlarray.

    Data Types: char | string

    Variance offset for preventing divide-by-zero errors, specified as the comma-separated pair consisting of 'Epsilon' and a numeric scalar greater than or equal to 1e-5.

    Data Types: single | double

    Output Arguments


    dlY: Normalized data, returned as a dlarray. The output dlY has the same underlying data type as the input dlX.

    If the input data dlX is a formatted dlarray, dlY has the same dimension labels as dlX. If the input data is not a formatted dlarray, dlY is an unformatted dlarray with the same dimension order as the input data.
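
    As a quick check of these output properties, the following sketch passes single-precision, formatted input and inspects the labels and the underlying type of the result. The sizes and parameter values are assumptions for illustration.

```matlab
% Single-precision formatted input (illustrative sizes)
dlX = dlarray(rand(4,4,6,1,'single'),'SSCB');
dlY = groupnorm(dlX,3,zeros(6,1),ones(6,1));

outLabels = dims(dlY)                  % dimension labels of the output
outType = class(extractdata(dlY))      % underlying data type of the output
```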

    Algorithms

    The group normalization operation normalizes the elements $x_i$ of the input by first calculating the mean $\mu_G$ and variance $\sigma_G^2$ over spatial, time, and grouped subsets of the channel dimensions for each observation independently. Then, it calculates the normalized activations as

    $$\hat{x}_i = \frac{x_i - \mu_G}{\sqrt{\sigma_G^2 + \epsilon}},$$

    where $\epsilon$ is a constant that improves numerical stability when the variance is very small. To allow for the possibility that inputs with zero mean and unit variance are not optimal for the operations that follow group normalization, the group normalization operation further shifts and scales the activations using the transformation

    $$y_i = \gamma \hat{x}_i + \beta,$$

    where the offset β and scale factor γ are learnable parameters that are updated during network training.
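
    The two steps above can be sketched with plain array operations and compared against groupnorm. This is an illustrative reconstruction under stated assumptions, not the function's actual implementation: it assumes that groups consist of consecutive channels and that the variance is the population variance (normalized by the number of elements). The sizes, the random parameters, and the variable names are hypothetical.

```matlab
% Assumed sizes and parameters (illustrative only)
H = 4; W = 4; C = 6; N = 2; numGroups = 3;
epsilon = 1e-5;
X = rand(H,W,C,N);
gamma = rand(C,1);                      % scale factor
beta = rand(C,1);                       % offset

% Reference result from groupnorm (offset first, then scale factor)
dlY = groupnorm(dlarray(X,'SSCB'),numGroups,beta,gamma,'Epsilon',epsilon);

% One column per (group, observation); relies on column-major storage
% and on the assumption that groups hold consecutive channels
Xg = reshape(X,[H*W*(C/numGroups), numGroups, N]);
mu = mean(Xg,1);
sigma2 = var(Xg,1,1);                   % population variance along dim 1
Xhat = reshape((Xg - mu)./sqrt(sigma2 + epsilon),[H,W,C,N]);

% Per-channel scale and shift, broadcast along the channel dimension
Y = reshape(gamma,[1,1,C]).*Xhat + reshape(beta,[1,1,C]);

maxErr = max(abs(Y - extractdata(dlY)),[],'all')
```

    If the assumptions hold, maxErr is on the order of floating-point round-off.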

    References

    [1] Wu, Yuxin, and Kaiming He. “Group Normalization.” Preprint submitted June 11, 2018. https://arxiv.org/abs/1803.08494.


    Introduced in R2020b