neural fitted Q-iteration (NFQ)

E736830

Neural Fitted Q-Iteration (NFQ) is a batch reinforcement learning algorithm that trains a neural network to approximate the Q-function from a stored set of transitions, which makes learning data-efficient in continuous and high-dimensional state spaces.


Observed surface forms (1)

Surface form Occurrences
Neural Fitted Q-Iteration 0

Statements (41)

Predicate Object
instanceOf Q-learning variant
batch reinforcement learning method
off-policy value-based method
reinforcement learning algorithm
aimsTo handle continuous state spaces
handle high-dimensional state spaces
improve data efficiency in reinforcement learning
approximates Q-function
assumes Markov decision process setting
benefit improved sample efficiency compared to purely online methods
canBeAppliedTo continuous control problems
robotics
canReuse previously collected experience
category value-based reinforcement learning
contrastWith online Q-learning
enables policy derivation via greedy action selection over Q-values
handles continuous state representations via neural networks
hasAbbreviation NFQ
inputDataType state-action-reward-next-state tuples
introducedIn 2005
isBasedOn fitted Q-iteration framework
isDesignedFor model-free reinforcement learning
learningType off-policy
operatesOn batches of experience
optimizationObjective minimize temporal-difference error over batch
originalApplicationDomain control tasks
output approximate optimal action-value function
relatedTo deep Q-learning
fitted value iteration
requires collected dataset before each training phase
discrete action space in its basic form
supportsStateSpace continuous state spaces
high-dimensional state spaces
trainingParadigm batch training
updateProcess iteratively refits Q-network to updated targets
updateStyle fitted value iteration
usesFunctionApproximator neural network
usesLossFunction supervised regression loss
usesNetworkType feedforward neural network
usesTarget Bellman optimality equation
usesTechnique experience replay-like batch reuse
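
Illustrative sketch (not part of the statement set). The statements above describe a loop: take a fixed batch of state-action-reward-next-state tuples, compute Bellman optimality targets from the current Q-network, refit the network to those targets by supervised regression, and repeat. The following is a minimal pure-Python sketch of that loop under several assumptions: the names `TinyQNet`, `encode`, `q_value`, `nfq`, `greedy_action`, and `toy_batch` are hypothetical, each tuple carries an extra `done` flag for terminal transitions, the action is fed to the network as a one-hot input, and plain full-batch gradient descent stands in for whatever optimizer a real implementation would use (the original formulation is usually described as training with Rprop). The hyperparameters are arbitrary choices for a two-state toy problem.

```python
import math
import random


class TinyQNet:
    """One-hidden-layer feedforward net: (state, one-hot action) -> scalar Q-value."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return (Q-value, hidden activations) for one input vector."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2, h

    def fit(self, inputs, targets, epochs, lr):
        """Supervised regression: full-batch gradient descent on squared error."""
        n = len(inputs)
        for _ in range(epochs):
            gw1 = [[0.0] * len(row) for row in self.w1]
            gb1 = [0.0] * len(self.b1)
            gw2 = [0.0] * len(self.w2)
            gb2 = 0.0
            for x, y in zip(inputs, targets):
                q, h = self.forward(x)
                err = (q - y) / n  # gradient of the mean half-squared error w.r.t. q
                gb2 += err
                for j, hj in enumerate(h):
                    gw2[j] += err * hj
                    dpre = err * self.w2[j] * (1.0 - hj * hj)  # backprop through tanh
                    gb1[j] += dpre
                    for i, xi in enumerate(x):
                        gw1[j][i] += dpre * xi
            for j in range(len(self.w2)):
                self.w2[j] -= lr * gw2[j]
                self.b1[j] -= lr * gb1[j]
                for i in range(len(self.w1[j])):
                    self.w1[j][i] -= lr * gw1[j][i]
            self.b2 -= lr * gb2


def encode(state, action, n_actions):
    """Concatenate state features with a one-hot action code (an assumed input layout)."""
    return list(state) + [1.0 if a == action else 0.0 for a in range(n_actions)]


def q_value(net, state, action, n_actions):
    return net.forward(encode(state, action, n_actions))[0]


def nfq(batch, n_actions, gamma=0.5, iterations=20, epochs=200, lr=0.1, hidden=8, seed=0):
    """Iteratively refit the Q-network to Bellman optimality targets on a fixed batch."""
    rng = random.Random(seed)
    net = TinyQNet(len(batch[0][0]) + n_actions, hidden, rng)
    inputs = [encode(s, a, n_actions) for s, a, _, _, _ in batch]
    for _ in range(iterations):
        # Targets use the current network; terminal transitions do not bootstrap.
        targets = [
            r if done else r + gamma * max(q_value(net, s2, b, n_actions)
                                           for b in range(n_actions))
            for _, _, r, s2, done in batch
        ]
        net.fit(inputs, targets, epochs, lr)
    return net


def greedy_action(net, state, n_actions):
    """Derive a policy by picking the action with the highest Q-value."""
    return max(range(n_actions), key=lambda a: q_value(net, state, a, n_actions))


# Toy batch (hypothetical): two states, action 1 moves toward the goal,
# action 0 falls back to the start. Tuples are (s, a, r, s', done).
toy_batch = [
    ([0.0], 0, 0.0, [0.0], False),
    ([0.0], 1, 0.0, [1.0], False),
    ([1.0], 0, 0.0, [0.0], False),
    ([1.0], 1, 1.0, [1.0], True),
]
q_net = nfq(toy_batch, n_actions=2)
```

Calling `greedy_action(q_net, [0.0], 2)` then reads a policy off the fitted Q-values. Because the batch is fixed and reused in every iteration, no new environment interaction is needed during training, which is the property the statements describe as batch training and experience reuse.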

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Martin Riedmiller notableWork neural fitted Q-iteration (NFQ)