Neural Machine Translation by Jointly Learning to Align and Translate

E899030

"Neural Machine Translation by Jointly Learning to Align and Translate" is a seminal research paper that introduced an attention-based neural network architecture for machine translation, enabling models to learn soft alignments between source and target sentences during translation.



Statements (48)

Predicate Object
instanceOf computer science paper
research paper
scientific article
affiliatedInstitutionOfAuthors Université de Montréal
alignmentType soft alignment
approach learning soft alignments between source and target words
archive arXiv
arxivId 1409.0473
attentionAggregation weighted sum of encoder hidden states
attentionScoreFunction feedforward neural network
author Dzmitry Bahdanau
Kyunghyun Cho
Yoshua Bengio
citationStatus highly cited paper
comparesWith phrase-based statistical machine translation systems
contribution demonstrated that attention improves translation quality for long sentences
showed that neural MT can learn alignments similar to traditional alignment models
datasetUsed English–French translation corpus
domain statistical machine translation
evaluationMetric BLEU score
field artificial intelligence
machine learning
natural language processing
neural machine translation
firstAuthor Dzmitry Bahdanau
impact established attention as a core mechanism in neural sequence modeling
improvesOver basic encoder-decoder without attention
influenced Transformer architecture
subsequent attention-based neural models
introducedConcept additive attention
joint learning of alignment and translation
soft attention mechanism for neural machine translation
language English
learningParadigm supervised learning
lossFunction negative log-likelihood of target sentences
modelType recurrent neural network model
optimizationMethod stochastic gradient descent
proposedMethod attention-based encoder-decoder model
publicationType arXiv preprint
publicationYear 2014
shortTitle Jointly Learning to Align and Translate
subfield sequence-to-sequence learning
task machine translation
sequence-to-sequence learning
title Neural Machine Translation by Jointly Learning to Align and Translate
usesArchitecture encoder-decoder architecture
usesComponent RNN decoder with attention
bidirectional RNN encoder
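
The attentionScoreFunction, alignmentType, and attentionAggregation statements above can be illustrated with a minimal sketch of one decoder step of additive attention. It scores each encoder hidden state h_j against the previous decoder state s with a one-layer feedforward network, score_j = v · tanh(W s + U h_j), normalizes the scores with a softmax to obtain soft alignment weights, and returns the weighted sum of encoder states as the context vector. The function name, the pure-Python list representation, and all toy values are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import math


def additive_attention(prev_state, encoder_states, W, U, v):
    """One decoder step of additive attention (illustrative sketch).

    prev_state:     previous decoder hidden state, list of floats
    encoder_states: list of encoder hidden states, each a list of floats
    W, U:           weight matrices as lists of rows (hypothetical values)
    v:              weight vector, list of floats (hypothetical values)
    Returns (weights, context).
    """
    def matvec(matrix, vector):
        # Plain matrix-vector product over nested lists.
        return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

    projected_state = matvec(W, prev_state)

    # Feedforward score for every source position.
    scores = []
    for h in encoder_states:
        projected_h = matvec(U, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Softmax over source positions (shifted by the max for numerical stability).
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: weighted sum of encoder hidden states.
    dim = len(encoder_states[0])
    context = [
        sum(w * h[k] for w, h in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


# Toy inputs, chosen arbitrarily for the example.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prev_state = [0.5, -0.5]
W = [[1.0, 0.0], [0.0, 1.0]]
U = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = additive_attention(prev_state, encoder_states, W, U, v)
```

Because the weights come from a softmax, every source position receives a nonzero share of attention and the shares sum to one, which is what makes the alignment "soft" rather than a hard one-to-one assignment.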

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation relatedTo Neural Machine Translation by Jointly Learning to Align and Translate