Universal Value Function Approximators
E441116
Universal Value Function Approximators (UVFA) are a reinforcement learning framework that generalizes value functions over both states and goals, enabling agents to learn goal-conditioned behaviors in a unified way.
Statements (46)
| Predicate | Object |
|---|---|
| instanceOf | goal-conditioned value function model; reinforcement learning framework |
| abbreviation | UVFA |
| addresses | lack of generalization across goals in standard value functions |
| approximatorType | parametric function approximator |
| assumes | shared structure across goals |
| citationVenue | Proceedings of the 32nd International Conference on Machine Learning |
| commonImplementation | neural network |
| compatibleWith | Q-learning; actor-critic methods; policy gradient methods |
| coreIdea | generalize value functions over both states and goals; represent value as a function of state and goal |
| enables | generalization to unseen goals; goal-conditioned policies; multi-goal reinforcement learning; transfer across goals |
| evaluationDomain | grid-world tasks; navigation tasks |
| evaluationMetric | performance on multiple goals |
| field | machine learning; reinforcement learning |
| formalization | V(s, g), a value function over state s and goal g |
| goalRepresentation | can be continuous; can be discrete |
| inputIncludes | goal representation; state representation |
| inspired | universal function approximation in supervised learning |
| introducedBy | Daniel Horgan; David Silver; Karol Gregor; Tom Schaul |
| learningSignal | temporal-difference error |
| organization | DeepMind |
| outputRepresents | expected return for given state and goal |
| publicationTitle | Universal Value Function Approximators |
| publicationYear | 2015 |
| publishedIn | ICML 2015; International Conference on Machine Learning |
| relatedTo | goal-conditioned reinforcement learning; hindsight experience replay; successor features; universal policy approximators |
| usedFor | generalization over goal space; multi-task learning in reinforcement learning; transfer learning in reinforcement learning |
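
The statements above describe a parametric value function that takes both a state and a goal, is trained from temporal-difference errors, and relies on shared structure across goals to generalize to unseen ones. The following is a minimal sketch of that idea, not the architecture or training procedure from the publication. It assumes a hypothetical 7-state chain world, a goal-conditioned Q-learning update, and a hand-picked one-hot feature over the signed offset between goal and state. All names (`train`, `q_value`, `greedy_steps`, `HELD_OUT_GOAL`) are illustrative, and transitions are sampled uniformly at random rather than collected along trajectories.

```python
import random

N_STATES = 7                    # chain of states 0..6 (illustrative assumption)
ACTIONS = (-1, +1)              # step left, step right
GAMMA = 0.9                     # discount applied until the goal is reached
TRAIN_GOALS = (0, 1, 2, 4, 5, 6)
HELD_OUT_GOAL = 3               # never used as a goal during training


def step(state, action):
    """Deterministic chain dynamics; moves past either end are clipped."""
    return min(max(state + action, 0), N_STATES - 1)


def feature_index(state, goal):
    """Index of the one-hot feature for the signed offset goal - state."""
    return goal - state + N_STATES - 1


def q_value(weights, state, action_idx, goal):
    """Q(s, a, g): one weight per (offset, action) pair, shared by all goals."""
    return weights[feature_index(state, goal)][action_idx]


def train(num_updates=20000, alpha=0.2, seed=0):
    """Goal-conditioned Q-learning on uniformly sampled (state, action, goal)."""
    rng = random.Random(seed)
    weights = [[0.0, 0.0] for _ in range(2 * N_STATES - 1)]
    for _ in range(num_updates):
        goal = rng.choice(TRAIN_GOALS)
        state = rng.choice([s for s in range(N_STATES) if s != goal])
        action_idx = rng.randrange(len(ACTIONS))
        next_state = step(state, ACTIONS[action_idx])
        if next_state == goal:
            # Goal-specific reward of 1; no bootstrapping past the goal.
            target = 1.0
        else:
            best_next = max(
                q_value(weights, next_state, a, goal)
                for a in range(len(ACTIONS))
            )
            target = GAMMA * best_next
        # Temporal-difference error for this (state, action, goal) triple.
        td_error = target - q_value(weights, state, action_idx, goal)
        weights[feature_index(state, goal)][action_idx] += alpha * td_error
    return weights


def greedy_steps(weights, start, goal, max_steps=20):
    """Follow the greedy policy for `goal`; return steps taken, or None."""
    state, steps = start, 0
    while state != goal and steps < max_steps:
        action_idx = max(
            range(len(ACTIONS)),
            key=lambda a: q_value(weights, state, a, goal),
        )
        state = step(state, ACTIONS[action_idx])
        steps += 1
    return steps if state == goal else None


weights = train()
for start in range(N_STATES):
    # Steps the greedy policy needs to reach the goal it was never trained on.
    print(start, greedy_steps(weights, start, HELD_OUT_GOAL))
```

The offset feature hard-codes the shared structure across goals, which is why values learned for the training goals carry over to the held-out one. A neural network over separate state and goal representations, the common implementation listed above, would instead have to learn such structure from data.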
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.