Group Normalization

E701502

Group Normalization is a neural network normalization technique that divides channels into groups and normalizes activations within each group. It stabilizes training and is especially effective when batch sizes are small.

Statements (48)

Predicate Object
instanceOf deep learning method
neural network normalization technique
advantage does not require running estimates of statistics
independent of batch dimension
more stable training in memory-constrained settings
performance is less sensitive to batch size
works well with very small batch sizes
advantageOver Batch Normalization
appliesTo convolutional neural networks
feedforward neural networks
sequence models
vision models
citationYear 2018
comparedTo Batch Normalization
Instance Normalization
Layer Normalization
describedIn Group Normalization (2018) paper
doesNotDependOn batch dimension statistics
field computer vision
deep learning
goal improve optimization of deep networks
reduce internal covariate shift
hyperparameter group size
implementationDetail groups are formed by splitting channels along the channel dimension
mean and variance are computed per sample over the spatial dimensions and the channels within each group
includesParameter learnable scale (gamma)
learnable shift (beta)
introducedBy Kaiming He
Yuxin Wu
keyIdea divides channels into groups and normalizes within each group
motivation reduce dependence on batch size
stabilize training for small batch sizes
normalizationAxis channel groups
normalizes activations within each group
oftenUsedWith ResNet architectures
convolutional layers
object detection models
segmentation models
operatesOn feature channels
publishedAt ECCV 2018
relatedConcept Batch Normalization
Instance Normalization
Layer Normalization
typicalSetting detection and segmentation tasks with large images
small-batch training on GPUs
usesParameter number of groups
usesStatistics per-group mean
per-group variance
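
The statements above (channels split into groups, per-group mean and variance, learnable scale and shift) can be illustrated with a minimal pure-Python sketch. It is not taken from the paper or from any library: the function name `group_norm`, the list-of-channels input layout, and the `eps` default are assumptions made for illustration. It handles a single sample, which reflects the fact that the statistics do not depend on the batch dimension.

```python
import math


def group_norm(sample, num_groups, gamma=None, beta=None, eps=1e-5):
    """Group-normalize one sample.

    `sample` is a list of channels; each channel is a flat list of
    spatial values (hypothetical layout chosen for this sketch).
    """
    num_channels = len(sample)
    if num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")

    # Per-channel learnable scale (gamma) and shift (beta); identity by default.
    gamma = gamma if gamma is not None else [1.0] * num_channels
    beta = beta if beta is not None else [0.0] * num_channels
    channels_per_group = num_channels // num_groups

    out = []
    for g in range(num_groups):
        start = g * channels_per_group
        channels = sample[start:start + channels_per_group]

        # Statistics are pooled over all spatial positions of all channels in the group.
        values = [v for ch in channels for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)

        for offset, ch in enumerate(channels):
            c = start + offset
            out.append([gamma[c] * (v - mean) * inv_std + beta[c] for v in ch])
    return out


# Four channels with two spatial positions each, split into two groups.
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(sample, num_groups=2)
```

In this sketch, setting `num_groups` to 1 pools statistics over all channels of the sample, and setting it to the number of channels normalizes each channel on its own; these two extremes correspond to the Layer Normalization and Instance Normalization entries listed under comparedTo.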

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Layer Normalization relatedTo Group Normalization