neural tangent kernel
E457875
The neural tangent kernel is a theoretical construct that characterizes the training dynamics and generalization of infinitely wide neural networks by relating gradient descent to kernel methods.
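The description can be made concrete with the standard definition (notation is assumed here, not taken from this record): for a network $f_\theta$ with parameters $\theta$, the NTK is the inner product of parameter gradients at pairs of inputs, and under gradient flow the network outputs evolve by kernel dynamics.

\[
\Theta(x, x') \;=\; \big\langle \nabla_\theta f_\theta(x),\; \nabla_\theta f_\theta(x') \big\rangle,
\qquad
\frac{d f_\theta(x)}{dt} \;=\; -\sum_{i} \Theta(x, x_i)\, \frac{\partial L}{\partial f_\theta(x_i)},
\]

where $L$ is the training loss over examples $x_i$. In the infinite-width limit, $\Theta$ stays constant during training, which is what reduces gradient descent to a kernel method.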
Statements (47)
| Predicate | Object |
|---|---|
| instanceOf | kernel method; reproducing kernel; theoretical construct in machine learning |
| appliesTo | convolutional neural networks; fully connected neural networks; residual neural networks |
| approximates | finite-width network training dynamics when width is large |
| associatedWith | gradient flow; infinite-width limit of neural networks; linearization of neural networks around initialization |
| characterizes | generalization of infinitely wide neural networks; training dynamics of infinitely wide neural networks |
| contrastedWith | feature-learning regime of neural networks; finite-width non-linear training dynamics |
| definedIn | “Neural Tangent Kernel: Convergence and Generalization in Neural Networks” |
| dependsOn | activation function; network architecture; parameter initialization distribution |
| describes | evolution of network outputs under gradient descent; function space dynamics of neural networks |
| field | deep learning theory; machine learning; statistical learning theory |
| formalism | kernel defined by inner products of gradients of the network output with respect to the parameters, evaluated at pairs of inputs |
| framework | lazy training regime; linearized neural network training |
| hasVariant | convolutional neural tangent kernel; graph neural tangent kernel; neural tangent kernel for residual networks |
| inspired | subsequent work on feature learning beyond the NTK regime; subsequent work on wide-network generalization bounds |
| introducedBy | Arthur Jacot; Clément Hongler; Franck Gabriel |
| mathematicallyRelatedTo | Gaussian process limits of neural networks; kernel ridge regression; random feature models |
| property | induces a kernel regression predictor at convergence; is positive semi-definite; remains constant during training in the infinite-width limit |
| publicationYear | 2018 |
| relatesTo | gradient descent; kernel methods |
| usedFor | analyzing convergence of training in overparameterized networks; analyzing generalization in overparameterized networks; connecting neural networks to kernel regression; studying wide-network limits |
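The `formalism` statement above (inner products of parameter gradients) can be sketched directly. The following is a minimal illustration, not reference code from the defining paper: it computes the empirical NTK of a one-hidden-layer ReLU network with analytically written gradients. All names and the architecture are assumptions chosen for the example.

```python
import numpy as np

# Empirical NTK for a toy one-hidden-layer network f(x) = v . relu(W x) / sqrt(m).
# Large hidden width m is where the NTK regime (near-constant kernel) applies.
rng = np.random.default_rng(0)
d, m = 3, 10_000                     # input dimension, hidden width
W = rng.standard_normal((m, d))
v = rng.standard_normal(m)

def param_grads(x):
    """Gradient of f(x) with respect to (W, v), flattened into one vector."""
    pre = W @ x                      # pre-activations, shape (m,)
    act = np.maximum(pre, 0.0)       # ReLU activations
    dv = act / np.sqrt(m)            # df/dv
    # df/dW: ReLU mask times outgoing weight, outer product with the input
    dW = ((pre > 0) * v)[:, None] * x[None, :] / np.sqrt(m)
    return np.concatenate([dW.ravel(), dv])

def empirical_ntk(x1, x2):
    """Theta(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>."""
    return param_grads(x1) @ param_grads(x2)

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
k11 = empirical_ntk(x1, x1)
k12 = empirical_ntk(x1, x2)
print(k11, k12)
```

The Gram matrix this kernel builds is symmetric and positive semi-definite (each diagonal entry is a squared gradient norm), matching the `property` statements in the table; at finite width the kernel drifts during training, and only in the infinite-width limit does it stay fixed.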
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.