Latent Dirichlet Allocation
E898981
Bayesian model
bag-of-words model
generative probabilistic model
topic model
unsupervised learning method
Latent Dirichlet Allocation is a generative probabilistic model commonly used in natural language processing to discover latent topics within large collections of documents.
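The generative story behind that description can be made concrete with a small sampler. The following is a minimal sketch using only the Python standard library, assuming symmetric Dirichlet priors; the names `sample_dirichlet`, `generate_corpus`, and all default values are hypothetical and chosen for illustration. Each topic gets a word distribution, each document gets a topic distribution, and every word is produced by first picking a topic and then picking a word from that topic.

```python
import random


def sample_dirichlet(rng, concentration, size):
    """Draw one sample from a symmetric Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs=4, doc_len=12, n_topics=3, vocab_size=8,
                    alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus of word ids from the LDA generative process.

    alpha and beta are the (assumed symmetric) Dirichlet concentrations for
    the document-topic and topic-word distributions respectively.
    """
    rng = random.Random(seed)
    # One word distribution per topic.
    topic_word = [sample_dirichlet(rng, beta, vocab_size) for _ in range(n_topics)]
    doc_topic = []
    docs = []
    for _ in range(n_docs):
        # One topic distribution per document.
        theta = sample_dirichlet(rng, alpha, n_topics)
        doc_topic.append(theta)
        words = []
        for _ in range(doc_len):
            # Pick a topic for this word slot, then a word from that topic.
            z = rng.choices(range(n_topics), weights=theta)[0]
            w = rng.choices(range(vocab_size), weights=topic_word[z])[0]
            words.append(w)
        docs.append(words)
    return docs, doc_topic, topic_word


docs, doc_topic, topic_word = generate_corpus()
for doc_id, words in enumerate(docs):
    print(doc_id, words)
```

Word order inside each sampled document carries no information, which is why the model is usually paired with a bag-of-words representation. Inference runs this story in reverse: given only the documents, it estimates the two sets of distributions.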
Statements (59)
| Predicate | Object |
|---|---|
| instanceOf | Bayesian model; bag-of-words model; generative probabilistic model; topic model; unsupervised learning method |
| appliedIn | bioinformatics text analysis; digital humanities; news article analysis; scientific literature analysis; social media analysis |
| assumes | bag-of-words representation of documents; documents are mixtures of topics; topics are distributions over words |
| basedOn | Dirichlet distribution; multinomial distribution |
| differsFrom | probabilistic latent semantic analysis by using Dirichlet priors |
| evaluationMetric | perplexity; topic coherence |
| extends | probabilistic latent semantic analysis |
| field | machine learning; natural language processing; statistics |
| hasAbbreviation | LDA |
| hasComponent | topic distribution per document; word distribution per topic |
| hasHyperparameter | alpha; beta |
| hyperparameterAlphaControls | document-topic sparsity |
| hyperparameterBetaControls | topic-word sparsity |
| implementedIn | Gensim; MALLET; Stan; scikit-learn |
| inferenceMethod | collapsed Gibbs sampling; expectation-maximization; online variational Bayes; variational inference |
| input | corpus of documents |
| introducedBy | Andrew Y. Ng; David M. Blei; Michael I. Jordan |
| introducedInPaper | Latent Dirichlet Allocation |
| output | set of topics; topic proportions for each document; word distribution for each topic |
| publicationYear | 2003 |
| publishedIn | Journal of Machine Learning Research |
| relatedTo | latent semantic analysis; probabilistic latent semantic analysis |
| requires | predefined number of topics |
| usedFor | content-based recommendation; dimensionality reduction; document classification preprocessing; document clustering; feature extraction; information retrieval; recommender systems; text mining; topic discovery |
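
Several statements above (a corpus as input, a predefined number of topics, the alpha and beta hyperparameters, and the two output distributions) come together in collapsed Gibbs sampling, one of the listed inference methods. The following is a minimal standard-library sketch, not the implementation used by Gensim, MALLET, Stan, or scikit-learn; the function name `lda_gibbs`, its parameters, and the toy corpus of integer word ids are all hypothetical.

```python
import random


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
              n_iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA on documents given as lists of word ids.

    Returns (theta, phi): smoothed topic proportions per document and
    smoothed word distributions per topic.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # n[d][k]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # n[k][w]
    topic_total = [0] * n_topics                              # n[k]
    assignments = []

    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
         for w in range(vocab_size)]
        for t in range(n_topics)
    ]
    return theta, phi


# Toy corpus: word ids 0-2 and 3-5 never co-occur in the same document.
toy_docs = [[0, 1, 2, 0, 1], [0, 2, 1, 1], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
theta, phi = lda_gibbs(toy_docs, n_topics=2, vocab_size=6)
for row in theta:
    print([round(p, 2) for p in row])
```

In this sketch, smaller `alpha` values push each document toward fewer topics and smaller `beta` values push each topic toward fewer words, which is the sparsity behaviour the hyperparameter rows describe. A production run would also discard early iterations and average over several samples rather than reading off the final state.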
Referenced by (1)