Attention Is All You Need

E457850

computer science paper; research paper; scientific paper

"Attention Is All You Need" is the landmark 2017 research paper that introduced the Transformer architecture and revolutionized modern natural language processing and sequence modeling.

All labels observed (1)

Label	Occurrences
Attention Is All You Need (canonical)	2

Statements (53)

Predicate	Object
instanceOf	computer science paper; research paper; scientific paper
affiliatedInstitution	Google Brain; Google Research
applicationDomain	machine translation
architectureType	encoder-decoder
benchmarkDataset	WMT 2014 English-to-French translation; WMT 2014 English-to-German translation
citationStatus	highly cited paper
enabled	parallel training of sequence models
field	deep learning; machine learning; natural language processing; sequence modeling
hasAuthor	Aidan N. Gomez; Ashish Vaswani; Illia Polosukhin; Jakob Uszkoreit; Llion Jones; Niki Parmar; Noam Shazeer; Łukasz Kaiser
impact	became foundational for large language models; revolutionized modern natural language processing
inspiredModel	BERT; GPT series; T5
introducedConcept	Transformer architecture; multi-head attention; positional encoding; scaled dot-product attention; self-attention mechanism
optimizationMethod	Adam optimizer
outperformed	previous state-of-the-art machine translation models
proposedModel	Transformer
publicationYear	2017
publishedIn	Advances in Neural Information Processing Systems 30
publishedInConference	NeurIPS 2017
publisher	Neural Information Processing Systems Foundation
reduced	sequential computation in sequence models
replacedArchitecture	GRU networks; LSTM networks; recurrent neural networks
title	Attention Is All You Need
usesComponent	dropout regularization; layer normalization; multi-head self-attention layers; position-wise feed-forward networks; residual connections; stacked decoder layers; stacked encoder layers
usesTechnique	label smoothing
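
The scaled dot-product attention listed under introducedConcept is commonly written as softmax(Q K^T / sqrt(d_k)) V. The sketch below illustrates that formula in plain Python for a single head, under stated assumptions: the function names are hypothetical, inputs are lists of equal-length float vectors, and masking, batching, and the multi-head projections are left out. It is meant as a reading aid, not as the paper's reference implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per value)
    values:  list of vectors of length d_v
    Assumption: no masking and no learned projections.
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    d_v = len(values[0])
    output = []
    for q in queries:
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(d_v)]
        )
    return output
```

When every key matches a query equally, the weights are uniform and the result is the mean of the value vectors, which makes a convenient sanity check.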

Referenced by (2)

Full triples — surface form annotated when it differs from this entity's canonical label.

Transformer → introducedInPaper → Attention Is All You Need

Lukasz Kaiser → coAuthorOf → Attention Is All You Need

subject surface form: Łukasz Kaiser
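
The "Referenced by" rows above can be read as subject → predicate → object triples, with the surface form noted only when it differs from the canonical label. The sketch below shows one way to model that convention in Python. The `Triple` class, its field names, and the `render` method are hypothetical and chosen for illustration; they are not this knowledge base's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Triple:
    """One subject -> predicate -> object statement (hypothetical model)."""

    subject: str
    predicate: str
    obj: str
    # Set only when the observed surface form differs from the canonical label.
    subject_surface_form: Optional[str] = None

    def render(self) -> str:
        line = f"{self.subject} → {self.predicate} → {self.obj}"
        if self.subject_surface_form and self.subject_surface_form != self.subject:
            line += f"\nsubject surface form: {self.subject_surface_form}"
        return line


# The two incoming references listed on this page.
REFERENCED_BY = [
    Triple("Transformer", "introducedInPaper", "Attention Is All You Need"),
    Triple(
        "Lukasz Kaiser",
        "coAuthorOf",
        "Attention Is All You Need",
        subject_surface_form="Łukasz Kaiser",
    ),
]

for triple in REFERENCED_BY:
    print(triple.render())
```

Keeping the surface form optional mirrors the page's convention: the first triple renders as a single line, while the second carries the extra annotation.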