Universal Value Function Approximators

E441116

Universal Value Function Approximators (UVFA) are a reinforcement learning framework that generalizes value functions over both states and goals, enabling agents to learn goal-conditioned behaviors in a unified way.
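
As a sketch of what a value function over both states and goals looks like, the definition and its temporal-difference error can be written as follows. The goal-dependent reward r_g, the fixed discount γ, the policy π, and the parameter vector θ are notational assumptions made here for illustration; this entry does not specify them.

```latex
V^{\pi}(s, g) = \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_g(s_t, a_t, s_{t+1}) \,\middle|\, s_0 = s \right],
\qquad
V(s, g; \theta) \approx V^{\pi}(s, g)

\delta_t = r_g(s_t, a_t, s_{t+1}) + \gamma\, V(s_{t+1}, g; \theta) - V(s_t, g; \theta)
```

Here V(s, g; θ) is the parametric approximator, and δ_t is the error that drives learning for a transition evaluated under goal g.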

Statements (46)

Predicate Object
instanceOf goal-conditioned value function model
reinforcement learning framework
abbreviation UVFA
addresses lack of generalization across goals in standard value functions
approximatorType parametric function approximator
assumes shared structure across goals
citationVenue Proceedings of the 32nd International Conference on Machine Learning
commonImplementation neural network
compatibleWith Q-learning
actor-critic methods
policy gradient methods
coreIdea generalize value functions over both states and goals
represent value as a function of state and goal
enables generalization to unseen goals
goal-conditioned policies
multi-goal reinforcement learning
transfer across goals
evaluationDomain grid-world tasks
navigation tasks
evaluationMetric performance on multiple goals
field machine learning
reinforcement learning
formalization V(s,g) as value function over state s and goal g
goalRepresentation can be continuous
can be discrete
inputIncludes goal representation
state representation
inspiredBy universal function approximation in supervised learning
introducedBy Daniel Horgan
David Silver
Karol Gregor
Tom Schaul
learningSignal temporal-difference error
organization DeepMind
outputRepresents expected return for given state and goal
publicationTitle Universal Value Function Approximators
publicationYear 2015
publishedIn ICML 2015
International Conference on Machine Learning
relatedTo goal-conditioned reinforcement learning
hindsight experience replay
successor features
universal policy approximators
usedFor generalization over goal space
multi-task learning in reinforcement learning
transfer learning in reinforcement learning
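
The statements above describe a parametric approximator that takes a state representation and a goal representation, outputs an expected return, and learns from a temporal-difference error. A minimal sketch of that idea follows, assuming a two-stream bilinear form V(s, g) = (A x_s) · (B y_g) trained with semi-gradient TD(0). The class name `BilinearUVFA`, the helper `one_hot`, the embedding size, the learning rate, and the discount are all hypothetical choices for illustration, not details taken from this entry, and a practical implementation would more likely use neural networks for both streams.

```python
import numpy as np


class BilinearUVFA:
    """Illustrative goal-conditioned value model: V(s, g) = (A x_s) . (B y_g)."""

    def __init__(self, state_dim, goal_dim, embed_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        # Small random linear maps from input features to a shared embedding space.
        self.A = 0.1 * rng.standard_normal((embed_dim, state_dim))
        self.B = 0.1 * rng.standard_normal((embed_dim, goal_dim))

    def value(self, x_s, y_g):
        """Estimated return for state features x_s under goal features y_g."""
        return float((self.A @ x_s) @ (self.B @ y_g))

    def td_update(self, x_s, y_g, reward, x_next, done, gamma=0.9, lr=0.1):
        """One semi-gradient TD(0) step; returns the TD error before the update."""
        phi, psi = self.A @ x_s, self.B @ y_g  # state and goal embeddings
        bootstrap = 0.0 if done else gamma * self.value(x_next, y_g)
        delta = reward + bootstrap - float(phi @ psi)
        # Gradient of V w.r.t. A is outer(psi, x_s); w.r.t. B it is outer(phi, y_g).
        self.A += lr * delta * np.outer(psi, x_s)
        self.B += lr * delta * np.outer(phi, y_g)
        return delta


def one_hot(index, size):
    """Discrete representation; continuous feature vectors work the same way."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec


# Example: one transition from state 3 into goal state 4 on a 5-state chain.
model = BilinearUVFA(state_dim=5, goal_dim=5)
state, goal = one_hot(3, 5), one_hot(4, 5)
td_error = model.td_update(state, goal, reward=1.0, x_next=goal, done=True)
```

Because the goal enters only through its feature vector, the same parameters serve every goal. Generalization to unseen goals depends on those features sharing structure across goals; one-hot goal vectors, used here only to keep the example short, give no such sharing.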

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Hindsight Experience Replay relatedTo Universal Value Function Approximators