Adam

E701497

Adam is a widely used stochastic optimization algorithm in machine learning that combines ideas from momentum and adaptive learning rates to efficiently train deep neural networks.
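
The per-step update rule described in the referenced paper (standard formulation; g_t denotes the gradient at step t, \alpha the learning rate, and \beta_1, \beta_2, \epsilon the hyperparameters listed under defaultHyperparameter below):

    m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t
    v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2
    \hat{m}_t = m_t / (1 - \beta_1^t)
    \hat{v}_t = v_t / (1 - \beta_2^t)
    \theta_t = \theta_{t-1} - \alpha \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)

The bias-corrected estimates \hat{m}_t and \hat{v}_t compensate for the moment estimates being initialized at zero, and the division by \sqrt{\hat{v}_t} gives each parameter its own effective step size.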


Statements (46)

Predicate              Object
instanceOf             optimization algorithm
                       stochastic optimization method
abbreviationFor        Adaptive Moment Estimation
appliedIn              computer vision
                       natural language processing
                       reinforcement learning
                       speech recognition
basedOn                adaptive learning rates
                       momentum
                       stochastic gradient descent
commonVariant          AMSGrad
                       AdamW
comparedWith           AdaGrad
                       RMSProp
                       SGD with momentum
defaultHyperparameter  beta1 = 0.9
                       beta2 = 0.999
                       epsilon = 1e-8
                       learning rate = 0.001
describedIn            Adam: A Method for Stochastic Optimization
field                  deep learning
                       machine learning
hasProperty            computationally efficient
                       handles sparse gradients
                       memory efficient
                       scale invariant to gradient magnitudes
                       suitable for high-dimensional parameter spaces
                       suitable for large datasets
implementedIn          JAX
                       Keras
                       PyTorch
                       TensorFlow
introducedIn           2014
optimizationType       first-order method
performs               bias correction of moment estimates
proposedBy             Diederik P. Kingma
                       Jimmy Ba
publishedAt            International Conference on Learning Representations
publishedIn            2015
updates                parameters with element-wise adaptive learning rates
usedFor                stochastic optimization
                       training deep neural networks
uses                   exponentially decaying averages of past gradients
                       exponentially decaying averages of past squared gradients
                       first moment estimates of gradients
                       second moment estimates of gradients
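
A minimal runnable sketch of one Adam step, assuming NumPy arrays for the parameters and gradient; the function name adam_step and its signature are illustrative, not an API of any framework listed under implementedIn:

    import numpy as np

    def adam_step(theta, grad, m, v, t,
                  lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        """Apply one Adam update using the default hyperparameters listed above."""
        # Exponentially decaying averages of past gradients and past squared gradients
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        # Bias correction of the first and second moment estimates (t counts from 1)
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Element-wise adaptive learning-rate update of the parameters
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        return theta, m, v

In practice, training code would use one of the built-in implementations noted under implementedIn, such as torch.optim.Adam or tf.keras.optimizers.Adam, rather than a hand-rolled update.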

Referenced by (2)

Full triples — surface form annotated when it differs from this entity's canonical label.