word2vec

E906310

distributional semantics model, natural language processing technique, neural network-based representation learning method, word embedding model

word2vec is a neural network-based technique for learning dense vector representations of words that capture semantic and syntactic relationships, widely used in natural language processing.
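To illustrate how relationships between words can show up as geometry in a vector space, here is a minimal sketch of the analogy lookup usually written as king - man + woman ≈ queen. The vectors in `toy_vectors` are hand-written, hypothetical values chosen for illustration, not embeddings trained by word2vec, and the helper names `cosine` and `analogy` are assumptions of this sketch.

```python
import math

# Hand-written 3-dimensional toy vectors (hypothetical values, NOT trained embeddings).
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.5, 0.5, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


result = analogy("king", "man", "woman", toy_vectors)
print(result)
```

With real embeddings the same lookup would run over the full vocabulary, and the nearest neighbour is only approximately the expected word.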


Observed surface forms (1)

Surface form	Occurrences
word2vec algorithm	1

Statements (48)

Predicate	Object
instanceOf	distributional semantics model; natural language processing technique; neural network-based representation learning method; word embedding model
basedOn	distributional hypothesis; neural networks
captures	semantic relationships between words; syntactic relationships between words
category	unsupervised learning
developedAt	Google
developedBy	Tomas Mikolov
domain	computational linguistics; natural language processing
embeddingDimension	typically 100–300
exampleProperty	king - man + woman ≈ queen
hasArchitecture	Continuous Bag-of-Words (CBOW); Skip-gram
implementedIn	Gensim; PyTorch; TensorFlow
inputUnit	word tokens
inspired	GloVe; fastText; many neural word embedding methods
introducedInPaper	Efficient Estimation of Word Representations in Vector Space
introducedInYear	2013
language	C (original implementation); Python (reference implementations)
license	Apache-style open source (original code)
optimizationTechnique	hierarchical softmax; negative sampling
output	word embeddings
popularized	vector arithmetic on words
representationType	continuous vector space; dense vectors
scalesTo	billions of tokens
supports	large vocabularies
task	learning dense vector representations of words
trainingDataType	unlabeled text corpora
trainingObjective	predict context words from target word (Skip-gram); predict target word from context (CBOW)
usedFor	feature extraction for NLP models; information retrieval; machine translation (as component); semantic clustering; text classification; word analogy tasks; word similarity
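
The two training objectives listed above differ in which side of a context window is the input and which is the prediction target. The sketch below shows how training examples could be built from a tokenized sentence for each architecture. It assumes a fixed symmetric window and whitespace tokenization as simplifications. The function names `skipgram_pairs` and `cbow_examples` are hypothetical and are not taken from any particular implementation.

```python
def skipgram_pairs(tokens, window=2):
    """Build (target, context) pairs: Skip-gram predicts each context word from the target."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """Build (context_words, target) examples: CBOW predicts the target from its context."""
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = tokens[lo:i] + tokens[i + 1:hi]
        if context:  # skip positions that have no neighbours at all
            examples.append((context, target))
    return examples


tokens = "the quick brown fox".split()
sg = skipgram_pairs(tokens, window=1)
cb = cbow_examples(tokens, window=1)

for target, context in sg:
    print("skip-gram:", target, "->", context)
for context, target in cb:
    print("cbow:", context, "->", target)
```

In both cases the examples come from unlabeled text alone. A model trained on them would then use hierarchical softmax or negative sampling to avoid computing a full softmax over the vocabulary; that step is not shown here.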

Referenced by (2)

Full triples — surface form annotated when it differs from this entity's canonical label.

Tomas Mikolov → developed → word2vec

this entity surface form: word2vec algorithm

Tomas Mikolov → knownFor → word2vec