Actor-Critic using Kronecker-Factored Trust Region

E441103

Actor-Critic using Kronecker-Factored Trust Region (ACKTR) is an on-policy reinforcement learning algorithm that improves sample efficiency and training stability by using Kronecker-factored approximate curvature to compute approximate natural gradient updates, bounded by a trust region, for both the actor and the critic.
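
The Kronecker-factored update described above can be illustrated for a single fully connected layer. The block below is a minimal sketch, not the original implementation: it assumes the layer's Fisher block is approximated as A ⊗ S, where A is the second moment of the layer inputs and S is the second moment of the gradients backpropagated to the layer outputs, so the natural gradient is S⁻¹ · grad_W · A⁻¹. All function names, the damping term, and the pure-Python matrix helpers are illustrative assumptions.

```python
# Minimal sketch of a Kronecker-factored natural-gradient direction for one
# fully connected layer. Names and the damping value are hypothetical; a real
# implementation would use a numerical library and running averages.

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def second_moment(vectors, damping):
    """Average outer product of the sample vectors, plus damping * identity."""
    n = len(vectors[0])
    M = [[sum(v[i] * v[j] for v in vectors) / len(vectors) for j in range(n)]
         for i in range(n)]
    for i in range(n):
        M[i][i] += damping
    return M


def inverse(M):
    """Gauss-Jordan inverse with partial pivoting (adequate for small factors)."""
    n = len(M)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def kfac_natural_gradient(activations, backprop_grads, grad_W, damping=1e-2):
    """Return S^-1 @ grad_W @ A^-1 for a weight matrix of shape (out, in).

    activations:    list of layer-input vectors, each of length `in`
    backprop_grads: list of output-gradient vectors, each of length `out`
    grad_W:         ordinary gradient of the objective w.r.t. the weights
    """
    A = second_moment(activations, damping)     # (in, in) input factor
    S = second_moment(backprop_grads, damping)  # (out, out) gradient factor
    return matmul(matmul(inverse(S), grad_W), inverse(A))


if __name__ == "__main__":
    acts = [[1.0, 0.5], [0.2, 1.0], [0.7, 0.3]]
    grads = [[0.4], [-0.1], [0.3]]
    print(kfac_natural_gradient(acts, grads, [[0.2, -0.1]]))
```

The point of the factorization is cost: only the two small factors are inverted, rather than the full Fisher block whose side length is the product of the layer's input and output sizes.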

Statements (46)

Predicate Object
instanceOf actor-critic method
policy gradient method
reinforcement learning algorithm
abbreviation ACKTR
aimsTo improve sample efficiency
improve training stability
appliedTo policy parameters
value function parameters
approximates Fisher information matrix
basedOn trust region optimization
category deep learning optimization method
second-order reinforcement learning method
comparedWith A2C in original paper
TRPO in original paper
constrains policy update step size via trust region
designedFor deep reinforcement learning
evaluatedOn Atari 2600 benchmark
MuJoCo continuous control tasks
implementedIn TensorFlow in original code release
improves data efficiency compared to first-order methods
stability compared to vanilla policy gradient
introducedBy Elman Mansimov
Jimmy Ba
Roger B. Grosse
Shun Liao
Yuhuai Wu
introducedIn paper "Scalable Trust-Region Method for Deep Reinforcement Learning Using Kronecker-Factored Approximation"
openSource true
optimizes actor network
critic network
publishedAt NIPS 2017
relatedTo A2C
A3C
TRPO
Trust Region Policy Optimization
natural policy gradient
targets maximization of expected return
uses Kronecker-factored approximate curvature
Kronecker-factored approximation of curvature
actor-critic architecture
advantage estimates
mini-batch updates
natural gradient
on-policy learning
second-order optimization information
stochastic gradient estimates
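
The statement that ACKTR constrains the policy update step size via a trust region can be made concrete with a small sketch. It assumes the commonly described rule in which the step size is the smaller of a maximum learning rate and sqrt(2 * delta / (d^T F d)), where d is the natural-gradient direction, F is the curvature estimate, and delta bounds the approximate KL change per update. Because d = F⁻¹ g under the same estimate, the quadratic form reduces to the dot product of d with the ordinary gradient g. The function names and the default values 0.25 and 0.001 are illustrative assumptions, not settings taken from this page.

```python
import math


def trust_region_step_size(grad, nat_grad, max_step=0.25, kl_bound=0.001):
    """Return the step size min(max_step, sqrt(2 * kl_bound / (d^T F d))).

    grad and nat_grad are flat lists of equal length. With d = F^-1 g,
    the quadratic form d^T F d equals the dot product d . g.
    """
    quad = sum(d * g for d, g in zip(nat_grad, grad))
    if quad <= 0.0:
        # Degenerate curvature estimate: fall back to the maximum step.
        return max_step
    return min(max_step, math.sqrt(2.0 * kl_bound / quad))


def apply_update(params, nat_grad, step):
    """Gradient ascent on expected return along the natural-gradient direction."""
    return [p + step * d for p, d in zip(params, nat_grad)]


if __name__ == "__main__":
    g = [1.0, 0.0]   # hypothetical ordinary gradient
    d = [2.0, 0.0]   # hypothetical natural-gradient direction
    eta = trust_region_step_size(g, d)
    print(eta, apply_update([0.0, 0.0], d, eta))
```

When the natural-gradient direction is large, the bound shrinks the step so the predicted KL change stays near delta; when it is small, the maximum learning rate caps the step instead.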

Referenced by (2)

Full triples — surface form annotated when it differs from this entity's canonical label.

ACKTR abbreviationOf Actor-Critic using Kronecker-Factored Trust Region
ACKTR fullName Actor-Critic using Kronecker-Factored Trust Region