Reformer architecture
E899032
The Reformer is a Transformer-based neural network architecture that improves efficiency by replacing standard self-attention with locality-sensitive hashing (LSH) attention and by using reversible residual layers, greatly reducing memory usage and computational cost.
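To make the scale of the saving concrete: for a sequence of L = 65,536 tokens, full self-attention scores L² ≈ 4.3 × 10⁹ query-key pairs, while hashing and sorting scale as L log₂ L ≈ 1.0 × 10⁶. The sketch below is a minimal single-round, single-head illustration of the LSH attention idea in NumPy, not the paper's implementation: it hashes a shared query-key projection with random rotations, sorts tokens by bucket, and attends within fixed-size chunks of the sorted sequence. All function and variable names are illustrative, and refinements from the paper (multiple hash rounds, attending to the preceding chunk, causal masking) are omitted.

```python
import numpy as np

def lsh_bucket_ids(x, n_buckets, rng):
    """Angular LSH (illustrative): project onto random directions and
    take the argmax over [xR; -xR] to get one bucket id per token."""
    r = rng.standard_normal((x.shape[-1], n_buckets // 2))
    rotated = x @ r                                  # (L, n_buckets // 2)
    return np.argmax(np.concatenate([rotated, -rotated], axis=-1), axis=-1)

def lsh_attention(x, w_qk, w_v, n_buckets=8, chunk_size=16, rng=None):
    """Single-round, single-head LSH attention sketch (no masking,
    no look-back to the previous chunk, unlike the full method)."""
    if rng is None:
        rng = np.random.default_rng(0)
    qk = x @ w_qk                                    # shared query-key projection
    v = x @ w_v
    buckets = lsh_bucket_ids(qk, n_buckets, rng)
    order = np.argsort(buckets, kind="stable")       # group equal buckets together
    qk_s, v_s = qk[order], v[order]

    out_sorted = np.zeros_like(v_s)
    for start in range(0, len(x), chunk_size):       # attend within chunks only
        sl = slice(start, start + chunk_size)
        scores = qk_s[sl] @ qk_s[sl].T / np.sqrt(qk.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        out_sorted[sl] = (w / w.sum(axis=-1, keepdims=True)) @ v_s[sl]

    out = np.empty_like(out_sorted)
    out[order] = out_sorted                          # undo the bucket sort
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((128, 32))                   # 128 tokens, width 32
y = lsh_attention(x, rng.standard_normal((32, 32)), rng.standard_normal((32, 32)))
print(y.shape)                                       # (128, 32)
```

Sorting by bucket is what lets attention run over contiguous chunks of memory; this corresponds to the usesSorting and attentionRestriction statements below.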
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf | Transformer-based model; neural network architecture |
| aimsTo | improve Transformer efficiency; reduce computational cost; reduce memory usage |
| attentionRestriction | within hash buckets |
| basedOn | Transformer architecture |
| belongsTo | efficient Transformer family |
| comparedTo | standard Transformer |
| competesWith | Linformer; Longformer; Sparse Transformer |
| designedFor | large-context tasks; long sequence modeling; memory-efficient training |
| evaluationDomain | language modeling benchmarks |
| groupsTokensBy | hash codes |
| hasComponent | LSH-based self-attention layer; position-wise feed-forward network; reversible residual block |
| hasKeyFeature | chunked feed-forward layers; locality-sensitive hashing attention; reversible residual layers; shared query-key projection |
| implementedIn | JAX; PyTorch; TensorFlow |
| inspiredBy | locality-sensitive hashing methods |
| introducedInPaper | Reformer: The Efficient Transformer |
| memoryOptimizationTechnique | activation recomputation from outputs; reversible residual computation (sketched after this table) |
| optimizationGoal | scalability to very long sequences |
| proposedBy | Anselm Levskaya; Nikita Kitaev; Łukasz Kaiser |
| publicationYear | 2020 |
| publishedBy | Google researchers |
| reduces | activation memory footprint; attention computation cost |
| reducesComplexityOf | self-attention |
| standardTransformerComplexity | O(L^2) |
| supports | autoregressive language modeling; sequence-to-sequence tasks |
| targetComplexity | O(L log L) |
| uses | LSH attention; locality-sensitive hashing; reversible layers |
| usesSorting | hash buckets for attention |
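The memoryOptimizationTechnique statements above describe the RevNet-style trick: in a reversible residual block, a layer's inputs can be recomputed exactly from its outputs, so per-layer activations need not be stored for backpropagation. Below is a minimal sketch under the assumption that F and G stand in for the attention and feed-forward sublayers; all names are illustrative.

```python
import numpy as np

def rev_block_forward(x1, x2, f, g):
    """Reversible residual block: y1 = x1 + F(x2), y2 = x2 + G(y1)."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def rev_block_inverse(y1, y2, f, g):
    """Recover the inputs from the outputs, so activations can be
    recomputed during the backward pass instead of stored."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

rng = np.random.default_rng(0)
w_f, w_g = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
f = lambda h: np.tanh(h @ w_f)   # stand-in for the attention sublayer
g = lambda h: np.tanh(h @ w_g)   # stand-in for the feed-forward sublayer

x1, x2 = rng.standard_normal((4, 16)), rng.standard_normal((4, 16))
y1, y2 = rev_block_forward(x1, x2, f, g)
r1, r2 = rev_block_inverse(y1, y2, f, g)
assert np.allclose(x1, r1) and np.allclose(x2, r2)   # inputs recovered exactly
```

Because the inverse reuses the same F and G evaluations, activation memory stops growing with network depth, at the price of recomputing activations during the backward pass.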
Referenced by (1)
Full triples are shown with the surface form annotated when it differs from this entity's canonical label.
Subject surface form: Łukasz Kaiser