HER
E441111
HER (Hindsight Experience Replay) is a reinforcement learning technique that improves learning from sparse rewards by reinterpreting failed experiences as successful ones for alternative goals.
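The reinterpretation step can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Transition` record, the one-dimensional integer state space, the 0 / -1 reward convention, and the choice of the episode's final state as the alternative goal. It is not taken from the paper's code.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Transition:
    """Hypothetical goal-conditioned transition record."""
    state: int
    action: int
    next_state: int
    goal: int
    reward: float


def sparse_reward(achieved: int, goal: int) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel_with_final_state(episode: List[Transition]) -> List[Transition]:
    """Treat the state the episode ended in as if it had been the goal."""
    hindsight_goal = episode[-1].next_state
    return [
        replace(
            t,
            goal=hindsight_goal,
            reward=sparse_reward(t.next_state, hindsight_goal),
        )
        for t in episode
    ]


# A failed episode on a number line: the agent aimed for 5 but stopped at 3.
episode = [
    Transition(s, +1, s + 1, goal=5, reward=sparse_reward(s + 1, 5))
    for s in (0, 1, 2)
]
relabelled = relabel_with_final_state(episode)

print([t.reward for t in episode])     # every step is a failure for goal 5
print([t.reward for t in relabelled])  # the last step succeeds for goal 3
```

The original episode carries no success signal, while the relabelled copy ends with a rewarded transition that an off-policy learner can train on.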
Observed surface forms (1)
| Surface form | Occurrences |
|---|---|
| Hindsight Experience Replay | 0 |
Statements (42)
| Predicate | Object |
|---|---|
| instanceOf | reinforcement learning technique |
| abbreviation | HER |
| addresses | sparse reward problem |
| aimsTo | improve learning stability in sparse reward settings; reduce sample complexity in goal-based tasks |
| appliedTo | goal-conditioned reinforcement learning; multi-goal reinforcement learning |
| assumes | environment with goal space |
| benefits | navigation tasks with sparse rewards; robotic manipulation tasks |
| category | off-policy data augmentation method |
| commonlyCombinedWith | DDPG; Deep Deterministic Policy Gradient |
| coreIdea | reinterpret failed experiences as successful ones for alternative goals; relabelling goals in past trajectories |
| describedIn | paper "Hindsight Experience Replay" |
| fullName | Hindsight Experience Replay |
| improves | learning from sparse rewards; sample efficiency in reinforcement learning |
| inspired | extensions such as CHER and ARCHER; subsequent goal relabelling methods |
| introducedBy | Alex Ray; Bob McGrew; Filip Wolski; Jonas Schneider; Josh Tobin; Marcin Andrychowicz; OpenAI researchers; Peter Welinder; Rachel Fong |
| introducedIn | 2017 |
| modifies | stored transitions with alternative goals |
| operatesOn | off-policy reinforcement learning algorithms |
| publishedAt | NIPS 2017; Neural Information Processing Systems |
| relatedTo | goal relabelling; universal value function approximators |
| requires | goal-conditioned reward function |
| typeOf | model-free reinforcement learning enhancement |
| usedIn | reinforcement learning |
| uses | experience replay buffer |
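Several statements above (uses an experience replay buffer, requires a goal-conditioned reward function, modifies stored transitions with alternative goals) can be combined into one sketch. The class name `HindsightReplayBuffer`, the tuple layout, the integer states, and the parameter `k` are hypothetical. Goals are drawn from states reached later in the same episode, which follows the spirit of the "future" strategy described in the HER paper but is simplified here.

```python
import random
from typing import Callable, List, Tuple

# Assumed layouts: a step is (state, action, next_state);
# a stored transition is (state, action, next_state, goal, reward).
Step = Tuple[int, int, int]
Transition = Tuple[int, int, int, int, float]


def sparse_reward(achieved: int, goal: int) -> float:
    """Assumed goal-conditioned reward: 0 on success, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


class HindsightReplayBuffer:
    """Illustrative buffer that stores original and relabelled transitions."""

    def __init__(self, reward_fn: Callable[[int, int], float],
                 k: int = 4, seed: int = 0) -> None:
        self.reward_fn = reward_fn
        self.k = k  # number of hindsight copies per real transition
        self.rng = random.Random(seed)
        self.storage: List[Transition] = []

    def store_episode(self, steps: List[Step], goal: int) -> None:
        for i, (s, a, s_next) in enumerate(steps):
            # Keep the transition with the goal the agent actually pursued.
            self.storage.append((s, a, s_next, goal, self.reward_fn(s_next, goal)))
            # Add k copies whose goal is a state reached at step i or later.
            for _ in range(self.k):
                _, _, achieved = self.rng.choice(steps[i:])
                self.storage.append(
                    (s, a, s_next, achieved, self.reward_fn(s_next, achieved))
                )

    def sample(self, batch_size: int) -> List[Transition]:
        return self.rng.sample(self.storage, min(batch_size, len(self.storage)))


buffer = HindsightReplayBuffer(sparse_reward, k=2)
buffer.store_episode([(0, 1, 1), (1, 1, 2), (2, 1, 3)], goal=5)
batch = buffer.sample(4)
```

Because relabelling only rewrites stored data, a buffer like this could feed any off-policy learner (for example DDPG) whose policy and value function take the goal as an extra input. Recomputing rewards for the new goals is why the goal-conditioned reward function is a requirement.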
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.