KNN
E605673
instanceOf: lazy learning algorithm, machine learning algorithm, non-parametric method, supervised learning algorithm
KNN (k-nearest neighbors) is a simple, non-parametric machine learning algorithm used for classification and regression by predicting labels based on the closest training examples in the feature space.
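The core idea described above can be sketched in a few lines of plain Python. This is a minimal illustration only, not any particular library's implementation; the function names, toy data, and default k are made up for the example:

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, query, k=3):
    # Classification rule: majority vote among the k nearest training points.
    nearest = sorted(train, key=lambda p: euclidean(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    # Regression rule: average of the target values of the k nearest points.
    nearest = sorted(train, key=lambda p: euclidean(p[0], query))[:k]
    return sum(y for _, y in nearest) / k

# Toy data: points near (0, 0) are class "a", points near (5, 5) are "b".
labeled = [((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"),
           ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
print(knn_classify(labeled, (0.5, 0.5)))  # "a"
```

Note there is no training step beyond storing the data: all distance computation happens at prediction time, which is why KNN is called a lazy, instance-based learner.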
Statements (50)
| Predicate | Object |
|---|---|
| instanceOf | lazy learning algorithm; machine learning algorithm; non-parametric method; supervised learning algorithm |
| advantage | can model complex decision boundaries; naturally supports multi-class classification; simple to implement |
| assumption | nearby points in feature space tend to have similar labels |
| category | distance-based learning method |
| commonDistanceMetric | Euclidean distance; Manhattan distance; Minkowski distance; cosine distance |
| coreIdea | predicts labels based on nearest training examples in feature space |
| decisionRule | average of target values of k nearest neighbors for regression; majority vote among k nearest neighbors for classification |
| disadvantage | high memory usage because it stores all training data; performance degrades in high-dimensional spaces; slow for large training sets |
| fullName | k-nearest neighbors |
| hyperparameter | distance metric; k |
| implementedIn | MATLAB Statistics and Machine Learning Toolbox; R caret package; scikit-learn |
| improvementTechnique | KD-tree indexing; approximate nearest neighbor search; ball tree indexing; dimensionality reduction; feature scaling; feature selection |
| introducedIn | pattern recognition literature of the 1960s |
| outputType | continuous values for regression; discrete class labels for classification |
| parameterSelectionMethod | cross-validation for choosing k |
| predictionPhase | computes distances to training instances |
| property | computationally expensive at prediction time; instance-based learner; non-parametric because it makes no strong assumptions about data distribution; sensitive to feature scaling; sensitive to irrelevant features |
| relatedConcept | curse of dimensionality |
| requires | labeled training data |
| trainingPhase | stores training instances without building an explicit model |
| typicalApplication | image classification; pattern recognition; recommendation systems; text categorization |
| usedFor | classification; regression |
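The `parameterSelectionMethod` statement above (cross-validation for choosing k) can be illustrated with a leave-one-out sketch. Everything here is a hypothetical, self-contained example — the data, candidate k values, and helper names are invented for illustration:

```python
import math
from collections import Counter

def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def predict(train, query, k):
    # Majority vote among the k nearest labeled training points.
    nearest = sorted(train, key=lambda p: euclidean(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(data, k):
    # Leave-one-out cross-validation: predict each point from all the others.
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += predict(rest, x, k) == y
    return hits / len(data)

def choose_k(data, candidates=(1, 3, 5)):
    # Pick the candidate k with the highest leave-one-out accuracy.
    return max(candidates, key=lambda k: loo_accuracy(data, k))

data = [((0, 0), "a"), ((1, 0), "a"), ((0, 1), "a"), ((1, 1), "a"),
        ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b"), ((6, 6), "b")]
best_k = choose_k(data)
```

On real data the candidate grid and the scaling of features both matter: because KNN is sensitive to feature scaling (see the `property` statements), features are typically standardized before the distance computation.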
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.