AdaGrad

E565192

AdaGrad is an adaptive gradient descent optimization algorithm that maintains a separate learning rate for each parameter, scaled by the accumulated history of that parameter's squared gradients; this often improves convergence in sparse settings.
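
The per-parameter behaviour described above can be summarised in a short sketch. The following minimal NumPy illustration assumes an update of the form theta_t = theta_{t-1} - (eta / (sqrt(G_t) + epsilon)) * g_t, with G_t the running sum of past squared gradients (matching the updateRule statement below); the function and variable names are illustrative, not taken from any particular library.

    import numpy as np

    def adagrad_step(theta, G, grad, eta=0.01, eps=1e-8):
        """One AdaGrad update: accumulate squared gradients and scale the step
        element-wise, so frequently-updated parameters take smaller steps."""
        G = G + grad ** 2                      # G_t: running sum of past squared gradients
        theta = theta - eta / (np.sqrt(G) + eps) * grad
        return theta, G

    # Toy usage: minimize f(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
    theta = np.array([1.0, -2.0])
    G = np.zeros_like(theta)
    for _ in range(100):
        grad = theta                           # gradient of the toy objective
        theta, G = adagrad_step(theta, G, grad)

Because G only grows, the effective step size shrinks monotonically, which is the behaviour captured by the "monotonically decreasing learning rates" and "learning rate can become too small over time" statements below.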


Observed surface forms (1)

Surface form Occurrences
AMSGrad 1

Statements (47)

Predicate Object
instanceOf adaptive learning rate method
optimization algorithm
appliedIn computer vision
natural language processing
online learning
recommender systems
stochastic gradient descent variants
basedOn gradient descent
category first-order optimization method
comparedWith Adam
RMSProp
SGD
defines G_t as sum of past squared gradients
describedIn Adaptive Subgradient Methods for Online Learning and Stochastic Optimization
fullName Adaptive Gradient Algorithm
hasProperty accumulates squared gradients
adaptive learning rate
diagonal preconditioning
element-wise parameter updates
monotonically decreasing learning rates
no need for manual learning rate decay schedule
often improves convergence in sparse settings
per-parameter learning rates
scale-invariant to gradient magnitude
sensitive to learning rate hyperparameter
well-suited for sparse data
implementedIn PyTorch (usage sketch below, after this table)
TensorFlow
scikit-learn
influenced Adadelta
Adam
RMSProp
introducedIn 2011
limitation learning rate can become too small over time
may converge slowly in non-sparse settings
operatesOn model parameters
stochastic gradients
proposedBy Elad Hazan
John Duchi
Yoram Singer
publishedAt Journal of Machine Learning Research
updateRule theta_t = theta_{t-1} - (eta / (sqrt(G_t) + epsilon)) * g_t
usedFor optimizing objective functions
stochastic optimization
training machine learning models
uses epsilon for numerical stability
global initial learning rate
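
The implementedIn statements above list PyTorch among the frameworks shipping an AdaGrad optimizer. Below is a minimal usage sketch with torch.optim.Adagrad; the toy model, data, and the lr / eps values are illustrative assumptions rather than part of this entity's statements.

    import torch

    # Hypothetical model and data, purely for illustration.
    model = torch.nn.Linear(10, 1)
    x = torch.randn(32, 10)
    y = torch.randn(32, 1)

    # lr and eps correspond to the "global initial learning rate" and
    # "epsilon for numerical stability" statements above.
    optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01, eps=1e-10)
    loss_fn = torch.nn.MSELoss()

    for _ in range(10):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()            # stochastic gradients g_t
        optimizer.step()           # per-parameter AdaGrad update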

Referenced by (7)

Full triples — surface form annotated when it differs from this entity's canonical label.

RMSProp relatedTo AdaGrad
RMSProp improvesOn AdaGrad
Adam optimizer relatedTo AdaGrad
Adam optimizer inspiredBy AdaGrad
Adam optimizer hasVariant AdaGrad (this entity, surface form: AMSGrad)
MXNet supportsOptimization AdaGrad