neural fitted Q-iteration (NFQ)

E736830

Q-learning variant; batch reinforcement learning method; off-policy value-based method; reinforcement learning algorithm

Neural Fitted Q-Iteration (NFQ) is a batch reinforcement learning algorithm that approximates the Q-function with a neural network, repeatedly refitting the network by supervised regression on a stored set of transitions. Reusing the same experience across iterations makes it data-efficient in continuous and high-dimensional state spaces.


Observed surface forms (1)

Surface form	Occurrences
Neural Fitted Q-Iteration	0

Statements (41)

Predicate	Object
instanceOf	Q-learning variant; batch reinforcement learning method; off-policy value-based method; reinforcement learning algorithm
aimsTo	handle continuous state spaces; handle high-dimensional state spaces; improve data efficiency in reinforcement learning
approximates	Q-function
assumes	Markov decision process setting
benefit	sample efficiency compared to purely online methods
canBeAppliedTo	continuous control problems; robotics
canReuse	previously collected experience
category	value-based reinforcement learning
contrastWith	online Q-learning
enables	policy derivation via greedy action selection over Q-values
handles	continuous state representations via neural networks
hasAbbreviation	NFQ
inputDataType	state-action-reward-next-state tuples
introducedIn	2005
isBasedOn	fitted Q-iteration framework
isDesignedFor	model-free reinforcement learning
learningType	off-policy
operatesOn	batches of experience
optimizationObjective	minimize temporal-difference error over batch
originalApplicationDomain	control tasks
output	approximate optimal action-value function
relatedTo	deep Q-learning; fitted value iteration
requires	collected dataset before each training phase; discrete action space in its basic form
supportsStateSpace	continuous state spaces; high-dimensional state spaces
trainingParadigm	batch training
updateProcess	iteratively refits Q-network to updated targets
updateStyle	fitted value iteration
usesFunctionApproximator	neural network
usesLossFunction	supervised regression loss
usesNetworkType	feedforward neural network
usesTarget	Bellman optimality equation
usesTechnique	experience replay-like batch reuse
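
The statements above describe a loop: take a fixed batch of state-action-reward-next-state tuples, build regression targets from the Bellman optimality equation, refit a feedforward network to those targets, and repeat. The following is a minimal sketch of that loop under stated assumptions, not a reference implementation. All names (`init_mlp`, `fit`, `nfq`, `greedy_action`), the one-hot action encoding, the network size, the reward-maximizing form of the target, the warm start between iterations, and the use of plain gradient descent as the regression optimizer are illustrative choices that the statements do not specify.

```python
import numpy as np

def init_mlp(n_in, n_hidden, rng):
    # One hidden tanh layer and a scalar output: a small feedforward Q-network.
    return {"W1": rng.normal(0.0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0.0, 0.5, (n_hidden, 1)), "b2": np.zeros(1)}

def predict(params, x):
    h = np.tanh(x @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).ravel()

def fit(params, x, y, epochs=200, lr=0.05):
    # Supervised regression on 0.5 * mean squared error.
    # Assumption: plain full-batch gradient descent stands in for the optimizer.
    n = len(x)
    for _ in range(epochs):
        h = np.tanh(x @ params["W1"] + params["b1"])
        pred = (h @ params["W2"] + params["b2"]).ravel()
        err = (pred - y)[:, None] / n              # d(loss)/d(pred)
        dh = (err @ params["W2"].T) * (1.0 - h ** 2)  # backprop through tanh
        params["W2"] -= lr * (h.T @ err)
        params["b2"] -= lr * err.sum(axis=0)
        params["W1"] -= lr * (x.T @ dh)
        params["b1"] -= lr * dh.sum(axis=0)
    return params

def encode(states, actions, n_actions):
    # Network input: state vector concatenated with a one-hot discrete action.
    return np.hstack([states, np.eye(n_actions)[actions]])

def q_all_actions(params, states, n_actions):
    # Q(s, a) for every discrete action a; shape (len(states), n_actions).
    cols = [predict(params, encode(states, np.full(len(states), a), n_actions))
            for a in range(n_actions)]
    return np.stack(cols, axis=1)

def nfq(batch, n_actions, gamma=0.9, iterations=20, seed=0):
    s, a, r, s_next, done = batch
    rng = np.random.default_rng(seed)
    params = init_mlp(s.shape[1] + n_actions, 16, rng)
    x = encode(s, a, n_actions)
    for _ in range(iterations):
        # Bellman optimality targets computed from the current network,
        # held fixed while the network is refit to them.
        q_next = q_all_actions(params, s_next, n_actions)
        targets = r + gamma * (1.0 - done) * q_next.max(axis=1)
        params = fit(params, x, targets)
    return params

def greedy_action(params, state, n_actions):
    # Policy derivation: pick the action with the largest estimated Q-value.
    return int(np.argmax(q_all_actions(params, state[None, :], n_actions)[0]))

# Toy batch (hypothetical data): every transition is terminal,
# action 1 pays reward 1 and action 0 pays reward 0.
rng = np.random.default_rng(1)
states = rng.uniform(0.0, 1.0, (64, 1))
actions = np.arange(64) % 2
rewards = actions.astype(float)
batch = (states, actions, rewards, states.copy(), np.ones(64))

params = nfq(batch, n_actions=2)
action = greedy_action(params, np.array([0.5]), n_actions=2)
```

The batch is collected once and reused in every iteration, which is the source of the sample-efficiency benefit listed above. The `max` over a finite set of actions in the target is also why the basic form requires a discrete action space.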

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Martin Riedmiller → notableWork → neural fitted Q-iteration (NFQ)