ACKTR
E98477
ACKTR (Actor-Critic using Kronecker-Factored Trust Region) is a reinforcement learning algorithm that combines actor-critic methods with efficient second-order optimization via Kronecker-factored approximations to improve training stability and sample efficiency.
All labels observed (1)
| Label | Occurrences |
|---|---|
| ACKTR canonical | 3 |
Statements (37)
| Predicate | Object |
|---|---|
| instanceOf | actor-critic algorithm, reinforcement learning algorithm |
| abbreviationOf | Actor-Critic using Kronecker-Factored Trust Region |
| aimsToImprove | sample efficiency, training stability |
| approximates | natural gradient |
| basedOn | actor-critic framework, trust region optimization |
| category | policy gradient method, value-based method |
| combines | policy gradient learning, value function estimation |
| comparedWith | A2C, A3C, PPO, TRPO |
| designedFor | policy optimization, value function learning |
| field | deep reinforcement learning |
| fullName | Actor-Critic using Kronecker-Factored Trust Region |
| hasProperty | on-policy, sample efficient, stable training dynamics |
| introducedAs | efficient natural gradient actor-critic method |
| objective | maximize expected cumulative reward |
| optimizationType | second-order method |
| usedIn | Atari benchmarks, control tasks |
| usesApproximation | Kronecker-factored curvature matrix |
| usesComponent | actor network, critic network |
| usesGradientInformation | curvature-aware updates |
| usesNaturalGradient | true |
| usesNeuralNetworks | true |
| usesOptimizationMethod | Kronecker-factored approximation, second-order optimization |
| usesTrustRegion | true |
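
The statements above say that ACKTR approximates the natural gradient with a Kronecker-factored curvature matrix and applies a trust region. The following is a minimal sketch of that idea for a single fully connected layer, not a reference implementation. It assumes NumPy, and the names `kfac_direction`, `trust_region_step`, `damping`, `delta`, and `lr_max` are hypothetical, chosen only for illustration. The step-size rule is the trust-region scaling usually associated with ACKTR and is not taken from this page.

```python
import numpy as np


def kfac_direction(grad_W, acts, back_grads, damping=1e-2):
    """Approximate natural-gradient direction for one linear layer.

    grad_W:     (out, in) gradient of the loss w.r.t. the weight matrix
    acts:       (batch, in) inputs to the layer
    back_grads: (batch, out) gradients w.r.t. the layer's pre-activations

    The curvature (Fisher) matrix is approximated as kron(A, S), where A is
    the second moment of the inputs and S the second moment of the
    back-propagated gradients. Damping keeps both factors invertible.
    """
    n = acts.shape[0]
    A = acts.T @ acts / n + damping * np.eye(acts.shape[1])
    S = back_grads.T @ back_grads / n + damping * np.eye(back_grads.shape[1])
    # kron(A, S)^{-1} vec(grad_W) == vec(S^{-1} grad_W A^{-1}), so only the
    # two small factors are ever inverted, never the full curvature matrix.
    left = np.linalg.solve(S, grad_W)          # S^{-1} grad_W
    direction = np.linalg.solve(A, left.T).T   # (S^{-1} grad_W) A^{-1}, A symmetric
    return direction, A, S


def trust_region_step(direction, A, S, delta=1e-3, lr_max=0.25):
    """Step size that keeps the quadratic model of the change below delta."""
    # direction^T kron(A, S) direction, computed without forming kron(A, S).
    quad = float(np.sum(direction * (S @ direction @ A)))
    if quad <= 0.0:
        return lr_max
    return min(lr_max, float(np.sqrt(2.0 * delta / quad)))
```

A parameter update under these assumptions would be `W -= trust_region_step(d, A, S) * d`, with `d, A, S = kfac_direction(grad_W, acts, back_grads)`. Solving against the two factors separately is the design choice that makes the second-order update affordable: the cost depends on the layer's input and output widths rather than on the total number of weights squared.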
Referenced by (3)
Full triples — surface form annotated when it differs from this entity's canonical label.