minimum description length principle
E700157
The minimum description length (MDL) principle is a formal method in statistics and machine learning that selects, as the best explanation for a data set, the model that yields the shortest combined description of the model and of the data encoded with it.
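In its two-part form, the idea above is commonly written as a minimization over candidate models. The symbols used here ($\mathcal{H}$ for the set of candidate models, $D$ for the data, $L(\cdot)$ for code length in bits) are notation assumed for illustration, not taken from this entry.

```latex
\hat{H} \;=\; \arg\min_{H \in \mathcal{H}} \bigl[\, L(H) + L(D \mid H) \,\bigr]
```

Here $L(H)$ is the number of bits needed to describe the model and $L(D \mid H)$ is the number of bits needed to describe the data once the model is known.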
Statements (49)
| Predicate | Object |
|---|---|
| instanceOf | information-theoretic principle; model selection principle; statistical learning principle |
| appliedIn | classification; clustering; feature selection; hypothesis testing; model selection for graphical models; pattern recognition; regression; time series modeling |
| assumes | data are encoded with an optimal code relative to the model; shorter codes correspond to more probable regularities |
| basedOn | Kolmogorov complexity; algorithmic information theory; information theory |
| comparedTo | Bayesian marginal likelihood; cross-validation |
| contrastsWith | maximum likelihood principle |
| coreIdea | prefer the model that gives the shortest total description of model and data; trade off model complexity and goodness of fit |
| criterionType | universal coding-based criterion |
| field | machine learning; model selection; statistical inference; statistics |
| formalizes | Occam's razor |
| hasVariant | normalized maximum likelihood MDL; refined MDL; two-part MDL |
| implies | penalty for model complexity; preference for simpler models that fit data well |
| introducedBy | Jorma Rissanen |
| objective | achieve good generalization; avoid overfitting; minimize total code length of model and data |
| relatedTo | Akaike information criterion; Bayesian information criterion; Bayesian model selection; minimum message length principle |
| usedFor | choosing model class; choosing model order; regularization design |
| usesConcept | code length; description length; lossless data compression; stochastic complexity; two-part code; universal coding |
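The core idea and the two-part code listed above can be illustrated with a minimal sketch that chooses between two models of a binary sequence. Everything in it is an assumption made for illustration: the function names (`binary_entropy`, `description_lengths`, `select_model`) are hypothetical, the "fair" model is charged zero bits for the model plus one bit per symbol, and the "biased" model is charged the common asymptotic approximation of (1/2) log2 n bits for its single parameter plus n times the empirical entropy for the data. Refined variants such as normalized maximum likelihood use different code lengths.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source; 0 by convention when p is 0 or 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def description_lengths(bits):
    """Approximate two-part code lengths (in bits) for a non-empty 0/1 sequence."""
    n = len(bits)
    p_hat = sum(bits) / n  # maximum likelihood estimate of P(bit = 1)
    return {
        # Fair coin: no parameter to encode, exactly one bit per symbol.
        "fair": float(n),
        # Biased coin: parameter cost (assumed 0.5 * log2(n)) plus data cost.
        "biased": 0.5 * math.log2(n) + n * binary_entropy(p_hat),
    }


def select_model(bits):
    """Return the model name with the shortest total description, and all lengths."""
    lengths = description_lengths(bits)
    best = min(lengths, key=lengths.get)
    return best, lengths


skewed = [1] * 90 + [0] * 10    # strong regularity: mostly ones
balanced = [1] * 52 + [0] * 48  # nearly balanced

print(select_model(skewed)[0])    # → biased
print(select_model(balanced)[0])  # → fair
```

For the skewed sequence, the extra parameter pays for itself because it shortens the data part far more than it costs. For the nearly balanced sequence, the parameter cost outweighs the small saving, so the simpler model is preferred. This is the trade-off between model complexity and goodness of fit described in the table.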
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.