Reformer architecture

E899032

The Reformer architecture is a Transformer variant that replaces full self-attention with locality-sensitive hashing (LSH) attention and uses reversible residual layers, greatly reducing the memory and computational cost of modeling long sequences.
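The LSH attention mentioned above hashes similar vectors into shared buckets so each token attends only within its bucket. A minimal numpy sketch of the angular-hashing step, assuming a single hash round (the function name and shapes are illustrative, not the paper's implementation):

```python
import numpy as np

def lsh_hash(vecs, n_buckets, rng):
    """Angular LSH: project onto random rotations, then take the argmax
    over [xR ; -xR] so that nearby vectors tend to share a bucket."""
    # vecs: (seq_len, d_model); one random rotation (assumption: single round)
    rotations = rng.standard_normal((vecs.shape[-1], n_buckets // 2))
    rotated = vecs @ rotations                                  # (seq_len, n_buckets/2)
    return np.argmax(np.concatenate([rotated, -rotated], axis=-1), axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((16, 8))          # 16 token embeddings, d_model = 8
buckets = lsh_hash(x, n_buckets=4, rng=rng)
# Tokens are then sorted by bucket id and attend only within their bucket.
order = np.argsort(buckets, kind="stable")
```

Because the hash depends only on direction, rescaling a vector does not change its bucket, which is one reason Reformer can share a single projection for queries and keys.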


Statements (48)

Predicate  Object
instanceOf  Transformer-based model
instanceOf  neural network architecture
aimsTo  improve Transformer efficiency
aimsTo  reduce computational cost
aimsTo  reduce memory usage
attentionRestriction  within hash buckets
basedOn  Transformer architecture
belongsTo  efficient Transformer family
comparedTo  standard Transformer
competesWith  Linformer
competesWith  Longformer
competesWith  Sparse Transformer
designedFor  large-context tasks
designedFor  long sequence modeling
designedFor  memory-efficient training
evaluationDomain  language modeling benchmarks
groupsTokensBy  hash codes
hasComponent  LSH-based self-attention layer
hasComponent  position-wise feed-forward network
hasComponent  reversible residual block
hasKeyFeature  chunked feed-forward layers
hasKeyFeature  locality-sensitive hashing attention
hasKeyFeature  reversible residual layers
hasKeyFeature  shared query-key projection
implementedIn  JAX
implementedIn  PyTorch
implementedIn  TensorFlow
inspiredBy  locality-sensitive hashing methods
introducedInPaper  Reformer: The Efficient Transformer
memoryOptimizationTechnique  activation recomputation from outputs
memoryOptimizationTechnique  reversible residual computation
optimizationGoal  scalability to very long sequences
proposedBy  Anselm Levskaya
proposedBy  Nikita Kitaev
proposedBy  Łukasz Kaiser
publicationYear  2020
publishedBy  Google researchers
reduces  activation memory footprint
reduces  attention computation cost
reducesComplexityOf  self-attention
standardTransformerComplexity  O(L^2)
supports  autoregressive language modeling
supports  sequence-to-sequence tasks
targetComplexity  O(L log L)
uses  LSH attention
uses  locality-sensitive hashing
uses  reversible layers
usesSorting  hash buckets for attention
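The reversible residual computation listed under memoryOptimizationTechnique means layer inputs can be recomputed exactly from layer outputs, so activations need not be stored during training. A minimal sketch of the forward and inverse passes (the stand-in sublayers f and g are hypothetical, not the actual attention and feed-forward modules):

```python
import numpy as np

def rev_block_forward(x1, x2, f, g):
    """Reversible residual block: split the activations into two streams."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def rev_block_inverse(y1, y2, f, g):
    """Recover the inputs exactly from the outputs, enabling
    activation recomputation during the backward pass."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

# Hypothetical stand-ins for the attention (f) and feed-forward (g) sublayers.
f = lambda x: np.tanh(x)
g = lambda x: 0.5 * x

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal((2, 4, 8))
y1, y2 = rev_block_forward(x1, x2, f, g)
r1, r2 = rev_block_inverse(y1, y2, f, g)
# r1, r2 match x1, x2, so no intermediate activations need to be cached
```

This is why the statements above pair "activation recomputation from outputs" with "reduces activation memory footprint": memory no longer grows with the number of layers.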

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Lukasz Kaiser knownFor Reformer architecture
subject surface form: Łukasz Kaiser
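The chunked feed-forward layers listed among the key features can be illustrated with a short numpy sketch: the position-wise FFN is applied to slices of the sequence so that the large hidden activation exists for only one chunk at a time (weights and shapes here are illustrative assumptions):

```python
import numpy as np

def chunked_ffn(x, w1, w2, chunk_size):
    """Apply a position-wise feed-forward network chunk by chunk along the
    sequence axis, bounding peak activation memory by chunk_size."""
    out = []
    for start in range(0, x.shape[0], chunk_size):
        chunk = x[start:start + chunk_size]
        hidden = np.maximum(chunk @ w1, 0.0)   # ReLU; the hidden dim is the large one
        out.append(hidden @ w2)
    return np.concatenate(out, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((10, 4))
w1 = rng.standard_normal((4, 16))
w2 = rng.standard_normal((16, 4))
y = chunked_ffn(x, w1, w2, chunk_size=3)
y_full = np.maximum(x @ w1, 0.0) @ w2   # unchunked reference
```

Because the FFN is position-wise, chunking changes only the memory profile, not the result: y and y_full are numerically identical.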