Group Normalization
E701502
Group Normalization is a neural network normalization technique that divides channels into groups and normalizes the activations within each group. This stabilizes training and is especially effective for small batch sizes.
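
The description above can be written out as a formula. This is a sketch under assumed notation, none of which appears in the entry itself: an input x of shape (N, C, H, W), G groups of C/G channels each, S_g as the set of (c, h, w) positions whose channel falls in group g, g(c) as the group containing channel c, and a small constant ε added for numerical stability.

```latex
\mu_{n,g} = \frac{1}{m} \sum_{(c,h,w) \in S_g} x_{n,c,h,w}, \qquad
\sigma_{n,g}^{2} = \frac{1}{m} \sum_{(c,h,w) \in S_g} \left( x_{n,c,h,w} - \mu_{n,g} \right)^{2}, \qquad
m = \frac{C}{G} \cdot H \cdot W

y_{n,c,h,w} = \gamma_c \, \frac{x_{n,c,h,w} - \mu_{n,g(c)}}{\sqrt{\sigma_{n,g(c)}^{2} + \epsilon}} + \beta_c
```

The mean and variance carry a sample index n but are never summed over n, which is why the result does not depend on the batch size.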
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf | deep learning method; neural network normalization technique |
| advantage | does not require running estimates of statistics; independent of batch dimension; more stable training in memory-constrained settings; performance is less sensitive to batch size; works well with very small batch sizes |
| advantageOver | Batch Normalization |
| appliesTo | convolutional neural networks; feedforward neural networks; sequence models; vision models |
| citationYear | 2018 |
| comparedTo | Batch Normalization; Instance Normalization; Layer Normalization |
| describedIn | Group Normalization (2018) paper |
| doesNotDependOn | batch dimension statistics |
| field | computer vision; deep learning |
| goal | improve optimization of deep networks; reduce internal covariate shift |
| hyperparameter | group size |
| implementationDetail | groups are formed by splitting channels along the channel dimension; mean and variance are computed over spatial dimensions and group channels |
| includesParameter | learnable scale (gamma); learnable shift (beta) |
| introducedBy | Kaiming He; Yuxin Wu |
| keyIdea | divides channels into groups and normalizes within each group |
| motivation | reduce dependence on batch size; stabilize training for small batch sizes |
| normalizationAxis | channel groups |
| normalizes | activations within each group |
| oftenUsedWith | ResNet architectures; convolutional layers; object detection models; segmentation models |
| operatesOn | feature channels |
| publishedAt | ECCV 2018 |
| relatedConcept | Batch Normalization; Instance Normalization; Layer Normalization |
| typicalSetting | detection and segmentation tasks with large images; small-batch training on GPUs |
| usesParameter | number of groups |
| usesStatistics | per-group mean; per-group variance |
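
The implementation details and statistics listed above can be illustrated with a minimal NumPy sketch. It is not the reference implementation from the paper: the function name `group_norm`, the (N, C, H, W) layout, the `eps` constant, and the identity defaults for gamma and beta are all assumptions made for illustration.

```python
import numpy as np


def group_norm(x, num_groups, gamma=None, beta=None, eps=1e-5):
    """Group-normalize an array of shape (N, C, H, W) (hypothetical helper)."""
    n, c, h, w = x.shape
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")

    # Form groups by splitting channels along the channel dimension.
    grouped = x.reshape(n, num_groups, c // num_groups, h, w)

    # Per-sample, per-group mean and variance, computed over the group's
    # channels and the spatial dimensions. The batch axis is never reduced.
    mean = grouped.mean(axis=(2, 3, 4), keepdims=True)
    var = grouped.var(axis=(2, 3, 4), keepdims=True)
    normalized = ((grouped - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)

    # Learnable per-channel scale (gamma) and shift (beta); the defaults
    # here are an assumption that leaves the normalized values unchanged.
    if gamma is None:
        gamma = np.ones(c)
    if beta is None:
        beta = np.zeros(c)
    return normalized * gamma.reshape(1, c, 1, 1) + beta.reshape(1, c, 1, 1)


rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8, 5, 5))

out = group_norm(batch, num_groups=2)
# Normalizing the first sample on its own, as a batch of one.
single = group_norm(batch[:1], num_groups=2)
```

Because the statistics are computed per sample, `out[:1]` and `single` should agree up to floating-point error, which is the batch-size independence listed under the advantages.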
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.