Neural Machine Translation by Jointly Learning to Align and Translate
E899030
"Neural Machine Translation by Jointly Learning to Align and Translate" is a seminal research paper that introduced an attention-based neural network architecture for machine translation, enabling models to learn soft alignments between source and target sentences during translation.
All labels observed (1)
| Label | Occurrences |
|---|---|
| Neural Machine Translation by Jointly Learning to Align and Translate (canonical) | 1 |
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf | computer science paper; research paper; scientific article |
| affiliatedInstitutionOfAuthors | Université de Montréal |
| alignmentType | soft alignment |
| approach | learning soft alignments between source and target words |
| archive | arXiv |
| arxivId | 1409.0473 |
| attentionAggregation | weighted sum of encoder hidden states |
| attentionScoreFunction | feedforward neural network |
| author | Dzmitry Bahdanau; Kyunghyun Cho; Yoshua Bengio |
| citationStatus | highly cited paper |
| comparesWith | phrase-based statistical machine translation systems |
| contribution | demonstrated that attention improves translation quality for long sentences; showed that neural MT can learn alignments similar to traditional alignment models |
| datasetUsed | English–French translation corpus |
| domain | statistical machine translation |
| evaluationMetric | BLEU score |
| field | artificial intelligence; machine learning; natural language processing; neural machine translation |
| firstAuthor | Dzmitry Bahdanau |
| impact | established attention as a core mechanism in neural sequence modeling |
| improvesOver | basic encoder-decoder without attention |
| influenced | Transformer architecture; subsequent attention-based neural models |
| introducedConcept | additive attention; joint learning of alignment and translation; soft attention mechanism for neural machine translation |
| language | English |
| learningParadigm | supervised learning |
| lossFunction | negative log-likelihood of target sentences |
| modelType | recurrent neural network model |
| optimizationMethod | stochastic gradient descent |
| proposedMethod | attention-based encoder-decoder model |
| publicationType | arXiv preprint |
| publicationYear | 2014 |
| shortTitle | Jointly Learning to Align and Translate |
| subfield | sequence-to-sequence learning |
| task | machine translation; sequence-to-sequence learning |
| title | Neural Machine Translation by Jointly Learning to Align and Translate |
| usesArchitecture | encoder-decoder architecture |
| usesComponent | RNN decoder with attention; bidirectional RNN encoder |
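
The attentionScoreFunction, alignmentType, and attentionAggregation statements above can be illustrated with a minimal NumPy sketch of one additive-attention step: a small feedforward network scores each encoder hidden state against the current decoder state, a softmax turns the scores into soft alignment weights, and the context vector is the weighted sum of encoder hidden states. The function name `additive_attention`, the parameter names `W_a`, `U_a`, `v_a`, the dimensions, and the random values are assumptions made for illustration; this is not the authors' implementation and it omits the recurrent encoder and decoder entirely.

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, W_a, U_a, v_a):
    """One additive-attention step (illustrative sketch).

    decoder_state:  (n,)   previous decoder hidden state
    encoder_states: (T, m) one hidden state per source position
    W_a: (d, n), U_a: (d, m), v_a: (d,)  hypothetical learned parameters
    Returns (context, weights) with shapes (m,) and (T,).
    """
    # Feedforward score per source position: v_a . tanh(W_a s + U_a h_j)
    hidden = np.tanh(encoder_states @ U_a.T + decoder_state @ W_a.T)  # (T, d)
    scores = hidden @ v_a                                             # (T,)

    # Softmax over source positions gives the soft alignment weights.
    scores = scores - scores.max()  # subtract max for numerical stability
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()

    # Context vector: weighted sum of encoder hidden states.
    context = weights @ encoder_states                                # (m,)
    return context, weights

# Toy usage with random values (assumed sizes, no trained model).
rng = np.random.default_rng(0)
T, n, m, d = 5, 4, 6, 3
context, weights = additive_attention(
    rng.normal(size=n),
    rng.normal(size=(T, m)),
    rng.normal(size=(d, n)),
    rng.normal(size=(d, m)),
    rng.normal(size=d),
)
print(weights, context.shape)
```

Because the weights are non-negative and sum to 1, every source position contributes to the context vector to some degree, which is what distinguishes a soft alignment from a hard one.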
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation → relatedTo → Neural Machine Translation by Jointly Learning to Align and Translate