Reformer: The Efficient Transformer

E899034

Reformer: The Efficient Transformer is a research paper introducing a more memory- and computation-efficient Transformer architecture using techniques like locality-sensitive hashing attention and reversible layers.


Statements (48)

Predicate Object
instanceOf machine learning paper
neural network architecture
research paper
scientific publication
addresses quadratic memory complexity of self-attention
quadratic time complexity of self-attention
aimsTo enable training on very long sequences
reduce computational cost of Transformers
reduce memory usage of Transformers
applicationDomain language modeling
long-context tasks
sequence modeling
assumes similar tokens attend mostly to each other
basedOn Transformer architecture
category efficient Transformer variant
complexityClaim reduces attention complexity from O(L^2) to approximately O(L log L)
contribution demonstrates training on sequences with tens of thousands of tokens
shows LSH attention can approximate full attention with lower cost
shows reversible layers can significantly reduce activation memory
field deep learning
machine learning
natural language processing
neural networks
goal scale Transformers to longer sequences without prohibitive resource usage
improvesOn standard Transformer
influencedBy Reversible residual networks
locality-sensitive hashing
introducesTechnique chunked feed-forward layers
locality-sensitive hashing attention
reversible residual layers
shared query-key projections for attention
LSHAttentionProperty computes attention only within buckets
groups similar queries into buckets
optimizationTarget memory efficiency
time efficiency
proposes Reformer architecture
relatedTo Linformer
Longformer
Performer
Transformer
sparse attention models
reversibleLayerProperty reconstructs intermediate activations during backpropagation
stores only activations at boundaries between layers
title Reformer: The Efficient Transformer
uses approximate nearest neighbor search via LSH
position-wise feed-forward networks
reversible layers to recompute activations
self-attention mechanism
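
The complexityClaim statement above decomposes as follows, under the usual reading of the paper's analysis, with chunk size m and the number of hash rounds treated as constants (an illustrative reconstruction for this page, not notation from the paper):

```latex
% Cost of attending over a length-$L$ sequence:
\underbrace{\mathcal{O}(L^{2})}_{\text{full attention}}
\;\longrightarrow\;
\underbrace{\mathcal{O}(L \log L)}_{\text{sort positions by LSH bucket}}
+ \underbrace{\mathcal{O}(L\,m)}_{\text{attend within chunks of size } m}
= \mathcal{O}(L \log L) \quad \text{for fixed } m.
```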
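A minimal NumPy sketch of the LSHAttentionProperty and shared query-key statements above. The function names (`lsh_buckets`, `lsh_attention`) and shapes are assumptions made for this illustration; it uses the paper's angular LSH scheme but, for clarity, loops over buckets directly and omits the sort-and-chunk batching, multi-round hashing, causal masking, and the rule against a position attending to itself:

```python
import numpy as np

def lsh_buckets(x, n_buckets, rng):
    # Angular LSH: project onto random directions and take the argmax
    # over [xR; -xR], assigning each row of x to one of n_buckets buckets.
    r = rng.standard_normal((x.shape[-1], n_buckets // 2))
    rotated = x @ r                                   # (seq_len, n_buckets // 2)
    return np.argmax(np.concatenate([rotated, -rotated], axis=-1), axis=-1)

def lsh_attention(x, w_qk, w_v, n_buckets=8, seed=0):
    # Shared query-key projection: one matrix produces both Q and K, so
    # queries and keys land in the same hash buckets.
    rng = np.random.default_rng(seed)
    qk = x @ w_qk
    v = x @ w_v
    buckets = lsh_buckets(qk, n_buckets, rng)
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]               # positions hashed to bucket b
        scores = qk[idx] @ qk[idx].T / np.sqrt(qk.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[idx] = weights @ v[idx]                   # attention only within the bucket
    return out
```

In the paper's formulation, positions are instead sorted by bucket id and split into fixed-size chunks so the per-bucket work can be batched, which is where the O(L log L) sorting cost in the decomposition above comes from.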
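The reversibleLayerProperty statements likewise admit a compact sketch. This is a generic RevNet-style coupling (the names `ReversibleBlock`, `f`, and `g` are assumptions for the illustration); in Reformer, F is the attention layer and G the feed-forward layer:

```python
import numpy as np

class ReversibleBlock:
    # Two-stream reversible residual coupling (RevNet-style):
    #   y1 = x1 + F(x2),  y2 = x2 + G(y1)
    def __init__(self, f, g):
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        # Recompute the inputs from the outputs during backpropagation,
        # so only activations at layer boundaries need to be stored.
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2

# Sanity check: inversion recovers the inputs (up to float round-off).
rng = np.random.default_rng(0)
w1, w2 = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
block = ReversibleBlock(lambda x: np.tanh(x @ w1), lambda x: np.tanh(x @ w2))
x1, x2 = rng.standard_normal((4, 16)), rng.standard_normal((4, 16))
y1, y2 = block.forward(x1, x2)
r1, r2 = block.inverse(y1, y2)
assert np.allclose(x1, r1) and np.allclose(x2, r2)
```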
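Finally, the chunked feed-forward technique listed under introducesTechnique rests on the observation that a position-wise feed-forward network treats every sequence position independently, so the sequence axis can be processed in slices. A hedged sketch (the name `chunked_ffn` and the weight shapes are assumptions):

```python
import numpy as np

def chunked_ffn(x, w1, b1, w2, b2, chunk_size=64):
    # Position-wise FFN applied chunk by chunk along the sequence axis:
    # peak memory for the wide hidden layer scales with chunk_size
    # rather than with the full sequence length.
    outs = []
    for start in range(0, x.shape[0], chunk_size):
        chunk = x[start:start + chunk_size]           # (<=chunk_size, d_model)
        hidden = np.maximum(chunk @ w1 + b1, 0.0)     # ReLU hidden layer
        outs.append(hidden @ w2 + b2)
    return np.concatenate(outs, axis=0)               # same result as unchunked FFN
```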

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Lukasz Kaiser coAuthorOf Reformer: The Efficient Transformer
subject surface form: Łukasz Kaiser