Attention Is All You Need
E457850
"Attention Is All You Need" is a 2017 research paper that introduced the Transformer, an encoder-decoder architecture built on self-attention rather than recurrence. It became foundational for large language models and reshaped natural language processing and sequence modeling.
All labels observed (1)
| Label | Occurrences |
|---|---|
| Attention Is All You Need (canonical) | 2 |
Statements (53)
| Predicate | Object |
|---|---|
| instanceOf | computer science paper; research paper; scientific paper |
| affiliatedInstitution | Google Brain; Google Research |
| applicationDomain | machine translation |
| architectureType | encoder-decoder |
| benchmarkDataset | WMT 2014 English-to-French translation; WMT 2014 English-to-German translation |
| citationStatus | highly cited paper |
| enabled | parallel training of sequence models |
| field | deep learning; machine learning; natural language processing; sequence modeling |
| hasAuthor | Aidan N. Gomez; Ashish Vaswani; Illia Polosukhin; Jakob Uszkoreit; Llion Jones; Niki Parmar; Noam Shazeer; Łukasz Kaiser |
| impact | became foundational for large language models; revolutionized modern natural language processing |
| inspiredModel | BERT; GPT series; T5 |
| introducedConcept | Transformer architecture; multi-head attention; positional encoding; scaled dot-product attention; self-attention mechanism |
| optimizationMethod | Adam optimizer |
| outperformed | previous state-of-the-art machine translation models |
| proposedModel | Transformer |
| publicationYear | 2017 |
| publishedIn | Advances in Neural Information Processing Systems 30 |
| publishedInConference | NeurIPS 2017 |
| publisher | Neural Information Processing Systems Foundation |
| reduced | sequential computation in sequence models |
| replacedArchitecture | GRU networks; LSTM networks; recurrent neural networks |
| title | Attention Is All You Need |
| usesComponent | dropout regularization; layer normalization; multi-head self-attention layers; position-wise feed-forward networks; residual connections; stacked decoder layers; stacked encoder layers |
| usesTechnique | label smoothing |
Referenced by (2)
Full triples — surface form annotated when it differs from this entity's canonical label.
subject surface form:
Łukasz Kaiser