Transformer encoder-only
E457857
A Transformer encoder-only model is a neural network architecture that uses only the encoder stack of the Transformer to process input sequences, typically for tasks like classification, retrieval, and masked language modeling.
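The bidirectional self-attention that distinguishes an encoder-only model can be illustrated with a minimal single-head sketch in plain Python. It makes simplifying assumptions that are not part of this entry: the input vectors serve directly as queries and keys (no learned projections), and the function name `attention_weights` and the toy embeddings are hypothetical. The `causal=True` branch is included only to contrast with the decoder-style masking that an encoder-only model does not apply.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(vectors, causal=False):
    """Scaled dot-product attention weights for one head.

    Simplification: queries and keys are the input vectors themselves,
    with no learned projection matrices.
    """
    dim = len(vectors[0])
    weights = []
    for i, query in enumerate(vectors):
        scores = []
        for j, key in enumerate(vectors):
            if causal and j > i:
                # Decoder-style mask: position i cannot see later positions.
                scores.append(float("-inf"))
            else:
                dot = sum(q * k for q, k in zip(query, key))
                scores.append(dot / math.sqrt(dim))
        weights.append(softmax(scores))
    return weights


# Three toy token embeddings (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

encoder_weights = attention_weights(tokens)              # bidirectional
causal_weights = attention_weights(tokens, causal=True)  # for contrast only
```

In `encoder_weights`, the first position places non-zero weight on the positions after it, whereas in `causal_weights` those entries are zero. Each row can also be computed independently of the others, which is why the computation parallelizes over sequence positions.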
Statements (54)
| Predicate | Object |
|---|---|
| instanceOf | Transformer-based model, neural network architecture |
| advantage | captures bidirectional context, parallelizable over sequence positions |
| attentionDirection | bidirectional |
| attentionType | self-attention |
| basedOn | Transformer architecture |
| canBe | fine-tuned model, pretrained language model |
| canBeAppliedTo | multimodal tasks, vision tasks |
| commonlyImplementedIn | JAX, PyTorch, TensorFlow |
| doesNotUseComponent | Transformer decoder stack |
| domain | natural language processing |
| hasComponent | layer normalization, multi-head self-attention layer, position-wise feed-forward network, positional encoding, residual connections |
| hyperparameter | dropout rate, hidden size, intermediate feed-forward size, maximum sequence length, number of attention heads, number of encoder layers |
| inputType | continuous embeddings, token sequences |
| limitation | not directly suitable for autoregressive generation |
| outputType | pooled representation, sequence representations |
| relatedModel | ALBERT, BERT, DeBERTa, DistilBERT, ELECTRA, RoBERTa |
| trainingObjective | classification loss, contrastive loss, masked language modeling loss, metric learning loss |
| typicalUse | document classification, document embedding, information retrieval, masked language modeling, named entity recognition, semantic search, sentence classification, sentence embedding, sequence tagging, text classification, token classification |
| usesComponent | Transformer encoder stack |
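The relationship between the two output types listed above, and their use for sentence embedding and semantic search, can be sketched as follows. This is an illustration under stated assumptions, not the method of any particular model: it uses masked mean pooling, which is one common way to derive a pooled representation (BERT-style models may instead use the first token's vector), and the per-token vectors, masks, and function names are hypothetical stand-ins for real encoder outputs.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the vectors of real tokens, skipping padding (mask value 0)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(column) / len(kept) for column in zip(*kept)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical sequence representations: one vector per token.
# The last query vector stands for a padding position and is masked out.
query_tokens = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]
query_mask = [1, 1, 0]
doc_tokens = [[2.0, 0.0], [0.0, 2.0]]
doc_mask = [1, 1]

# Pooled representations: one fixed-size vector per sequence.
query_vec = mean_pool(query_tokens, query_mask)
doc_vec = mean_pool(doc_tokens, doc_mask)

# Semantic search ranks documents by a similarity score such as this one.
score = cosine_similarity(query_vec, doc_vec)
```

Token-level uses such as named entity recognition and sequence tagging consume the per-token vectors directly, while classification, embedding, and retrieval typically consume the pooled vector.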
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.