AdaDelta
E565193
AdaDelta is an adaptive learning rate optimization algorithm for training neural networks that improves upon methods like RMSProp by eliminating the need to manually set a global learning rate.
Statements (39)
| Predicate | Object |
|---|---|
| instanceOf | adaptive learning rate method; optimization algorithm; stochastic gradient-based optimization method |
| appliedIn | computer vision; natural language processing; speech recognition |
| basedOn | stochastic gradient descent |
| comparedWith | AdaGrad; Adam; Momentum; Nesterov momentum; RMSProp; SGD |
| describedIn | ADADELTA: An Adaptive Learning Rate Method |
| field | deep learning; machine learning |
| goal | accelerate convergence in deep networks; improve training stability; reduce sensitivity to initial learning rate choice |
| hasCharacteristic | adaptive per-parameter learning rates; no need for manual global learning rate; robust to choice of hyperparameters; scale-invariant update rule; uses running averages of squared gradients; uses running averages of squared parameter updates |
| hasHyperparameter | decay rate rho; epsilon |
| implementedIn | Keras; MXNet; PyTorch; TensorFlow; Theano |
| improvesUpon | RMSProp |
| introducedBy | Matthew D. Zeiler |
| optimizationType | first-order method |
| publicationYear | 2012 |
| updateRule | uses ratio of accumulated gradients to accumulated updates |
| usedFor | minimizing loss functions; training neural networks |
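The updateRule and hasHyperparameter statements above can be sketched as a minimal scalar example. This is an illustrative reimplementation, not framework code (libraries listed under implementedIn, e.g. PyTorch's `torch.optim.Adadelta`, apply the same rule per parameter tensor):

```python
import math

def adadelta_step(x, grad, eg2, edx2, rho=0.95, eps=1e-6):
    """One AdaDelta update for a single scalar parameter.

    eg2  -- running average of squared gradients, E[g^2]
    edx2 -- running average of squared parameter updates, E[dx^2]
    rho  -- decay rate of both running averages
    eps  -- small constant for numerical stability
    """
    # Accumulate the squared gradient.
    eg2 = rho * eg2 + (1 - rho) * grad ** 2
    # The step is the ratio of the two accumulators times the gradient;
    # no global learning rate appears anywhere.
    dx = -(math.sqrt(edx2 + eps) / math.sqrt(eg2 + eps)) * grad
    # Accumulate the squared update.
    edx2 = rho * edx2 + (1 - rho) * dx ** 2
    return x + dx, eg2, edx2

# Minimize f(x) = x^2 (gradient 2x) starting from x = 1.0.
x, eg2, edx2 = 1.0, 0.0, 0.0
for _ in range(200):
    x, eg2, edx2 = adadelta_step(x, 2.0 * x, eg2, edx2)
```

Because the numerator accumulates squared *updates* while the denominator accumulates squared *gradients*, their ratio carries the units of the parameter itself, which is the scale-invariance characteristic noted in the table.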
Referenced by (2)
Full triples — surface form annotated when it differs from this entity's canonical label.