Group Normalization

E701502

deep learning method; neural network normalization technique

Group Normalization is a neural network normalization technique that divides channels into groups and normalizes activations within each group. It stabilizes training and is especially effective at small batch sizes, because its statistics do not depend on the batch dimension.

Statements (48)

Predicate	Object
instanceOf	deep learning method; neural network normalization technique
advantage	does not require running estimates of statistics; independent of batch dimension; more stable training in memory-constrained settings; performance is less sensitive to batch size; works well with very small batch sizes
advantageOver	Batch Normalization
appliesTo	convolutional neural networks; feedforward neural networks; sequence models; vision models
citationYear	2018
comparedTo	Batch Normalization; Instance Normalization; Layer Normalization
describedIn	Group Normalization (2018) paper
doesNotDependOn	batch dimension statistics
field	computer vision; deep learning
goal	improve optimization of deep networks; reduce internal covariate shift
hyperparameter	group size
implementationDetail	groups are formed by splitting channels along the channel dimension; mean and variance are computed over spatial dimensions and group channels
includesParameter	learnable scale (gamma); learnable shift (beta)
introducedBy	Kaiming He; Yuxin Wu
keyIdea	divides channels into groups and normalizes within each group
motivation	reduce dependence on batch size; stabilize training for small batch sizes
normalizationAxis	channel groups
normalizes	activations within each group
oftenUsedWith	ResNet architectures; convolutional layers; object detection models; segmentation models
operatesOn	feature channels
publishedAt	ECCV 2018
relatedConcept	Batch Normalization; Instance Normalization; Layer Normalization
typicalSetting	detection and segmentation tasks with large images; small-batch training on GPUs
usesParameter	number of groups
usesStatistics	per-group mean; per-group variance
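
The implementationDetail, usesStatistics, and includesParameter statements above can be read together as one computation. The following is a minimal NumPy sketch of that computation, not a reference implementation: the function name `group_norm`, the `(N, C, H, W)` input layout, the per-channel shape of gamma and beta, and the `eps` default are all assumptions made for illustration.

```python
import numpy as np


def group_norm(x, num_groups, gamma=None, beta=None, eps=1e-5):
    """Group-normalize an array of shape (N, C, H, W). Illustrative sketch only."""
    n, c, h, w = x.shape
    if c % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")

    # Form groups by splitting channels along the channel dimension.
    grouped = x.reshape(n, num_groups, c // num_groups, h, w)

    # Per-sample, per-group mean and variance over the group's channels
    # and the spatial dimensions. The batch axis (0) is never reduced.
    mean = grouped.mean(axis=(2, 3, 4), keepdims=True)
    var = grouped.var(axis=(2, 3, 4), keepdims=True)
    normed = ((grouped - mean) / np.sqrt(var + eps)).reshape(n, c, h, w)

    # Learnable scale (gamma) and shift (beta), assumed here to be per channel.
    if gamma is None:
        gamma = np.ones(c)
    if beta is None:
        beta = np.zeros(c)
    return normed * gamma.reshape(1, c, 1, 1) + beta.reshape(1, c, 1, 1)


rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(2, 8, 4, 4))
y = group_norm(x, num_groups=4)
```

Because no statistic is taken across the batch axis, normalizing a single sample on its own gives the same result as normalizing it inside a larger batch, which is the sense in which the method does not depend on batch dimension statistics.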

Referenced by (1)

Full triples: surface form annotated when it differs from this entity's canonical label.

Layer Normalization → relatedTo → Group Normalization
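
One way to see why these methods are listed as related: with a single group, group normalization reduces over all channels and spatial positions of a sample, and with one group per channel it reduces over spatial positions only. The sketch below checks both limiting cases numerically. It assumes the convolutional-feature-map formulation, in which Layer Normalization normalizes over (C, H, W) and Instance Normalization over (H, W), and it omits the learnable scale and shift. All function names are hypothetical.

```python
import numpy as np


def normalize(x, axes, eps=1e-5):
    """Subtract the mean and divide by the standard deviation over `axes`."""
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def group_norm(x, num_groups, eps=1e-5):
    n, c, h, w = x.shape
    grouped = x.reshape(n, num_groups, c // num_groups, h, w)
    return normalize(grouped, (2, 3, 4), eps).reshape(n, c, h, w)


def layer_norm(x, eps=1e-5):
    # Assumed formulation: statistics over all channels and spatial positions.
    return normalize(x, (1, 2, 3), eps)


def instance_norm(x, eps=1e-5):
    # Assumed formulation: statistics over spatial positions of each channel.
    return normalize(x, (2, 3), eps)


x = np.random.default_rng(0).normal(size=(2, 6, 3, 3))
same_as_layer = np.allclose(group_norm(x, 1), layer_norm(x))
same_as_instance = np.allclose(group_norm(x, 6), instance_norm(x))
```

Under these assumptions both flags evaluate to True, while intermediate group counts give results that differ from both.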