word2vec

E906310

word2vec is a neural network-based technique for learning dense vector representations of words that capture semantic and syntactic relationships, widely used in natural language processing.


Observed surface forms (1)

Surface form Occurrences
word2vec algorithm 1

Statements (48)

Predicate Object
instanceOf distributional semantics model
natural language processing technique
neural network-based representation learning method
word embedding model
basedOn distributional hypothesis
neural networks
captures semantic relationships between words
syntactic relationships between words
category unsupervised learning
developedAt Google
developedBy Tomas Mikolov
domain computational linguistics
natural language processing
embeddingDimension typically 100–300
exampleProperty king - man + woman ≈ queen
hasArchitecture Continuous Bag-of-Words (CBOW)
Skip-gram
implementedIn Gensim
PyTorch
TensorFlow
inputUnit word tokens
inspired GloVe
fastText
many neural word embedding methods
introducedInPaper Efficient Estimation of Word Representations in Vector Space
introducedInYear 2013
language C (original implementation)
Python (third-party reimplementations, e.g. Gensim)
license Apache License 2.0 (original code)
optimizationTechnique hierarchical softmax
negative sampling
output word embeddings
popularized vector arithmetic on words
representationType continuous vector space
dense vectors
scalesTo billions of tokens
supports large vocabularies
task learning dense vector representations of words
trainingDataType unlabeled text corpora
trainingObjective predict context words from target word (Skip-gram)
predict target word from context (CBOW)
usedFor feature extraction for NLP models
information retrieval
machine translation (as component)
semantic clustering
text classification
word analogy tasks
word similarity
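
Illustrative sketch (not part of the extracted statements): the trainingObjective and exampleProperty entries above can be made concrete with a few lines of Python. The function names (skipgram_pairs, cbow_examples, analogy), the window parameter, and the 2-dimensional toy vectors are all assumptions made up for illustration; they are not the original C implementation, and the vectors are hand-built rather than trained. The first two functions show how the two architectures slice a token sequence into training examples; the last shows the vector arithmetic behind "king - man + woman ≈ queen" using cosine similarity.

```python
import math


def skipgram_pairs(tokens, window=2):
    """Skip-gram view: one (target, context) pair per context word in the window."""
    pairs = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """CBOW view: one (context words, target) example per position."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:
            examples.append((context, target))
    return examples


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def analogy(vectors, a, b, c):
    """Return the word closest to vectors[a] - vectors[b] + vectors[c].

    The three query words are excluded from the candidates, as is
    customary when evaluating word analogies.
    """
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


# Hand-made toy vectors (assumed axes: royalty, gender); NOT trained embeddings.
toy_vectors = {
    "king": [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man": [0.0, 1.0],
    "woman": [0.0, -1.0],
    "apple": [-1.0, 0.2],
}

if __name__ == "__main__":
    sentence = ["the", "cat", "sat"]
    print(skipgram_pairs(sentence, window=1))
    print(cbow_examples(sentence, window=1))
    print(analogy(toy_vectors, "king", "man", "woman"))
```

In a real model the pairs would feed a neural network trained with hierarchical softmax or negative sampling, and the vectors would typically have 100–300 dimensions; the sketch only shows the data layout and the arithmetic.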

Referenced by (2)

Full triples — surface form annotated when it differs from this entity's canonical label.

Tomas Mikolov developed word2vec
this entity surface form: word2vec algorithm
Tomas Mikolov knownFor word2vec