neural fitted Q-iteration (NFQ)
E736830
Q-learning variant
batch reinforcement learning method
off-policy value-based method
reinforcement learning algorithm
Neural Fitted Q-Iteration (NFQ) is a batch reinforcement learning algorithm that trains a neural network to approximate the Q-function from a stored set of transitions, which makes learning data-efficient in continuous and high-dimensional state spaces.
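As a rough illustration of what approximating the Q-function from batches means, one fitted Q-iteration step can be written as a regression problem. The notation here is an assumption for illustration and does not come from this entry: $\mathcal{D}$ is the stored batch of transitions $(s_i, a_i, r_i, s'_i)$, $\gamma \in [0, 1)$ is the discount factor, and $Q_k$ is the network after $k$ refits.

```latex
% Targets from the Bellman optimality equation, using the current network Q_k
y_i = r_i + \gamma \max_{a'} Q_k(s'_i, a')

% Supervised regression over the whole batch yields the next network
Q_{k+1} = \arg\min_{Q} \sum_{(s_i, a_i, r_i, s'_i) \in \mathcal{D}}
          \bigl( Q(s_i, a_i) - y_i \bigr)^2
```

For a terminal transition the bootstrap term is usually dropped, so that $y_i = r_i$.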
Observed surface forms (1)
| Surface form | Occurrences |
|---|---|
| Neural Fitted Q-Iteration | 0 |
Statements (41)
| Predicate | Object |
|---|---|
| instanceOf | Q-learning variant; batch reinforcement learning method; off-policy value-based method; reinforcement learning algorithm |
| aimsTo | handle continuous state spaces; handle high-dimensional state spaces; improve data efficiency in reinforcement learning |
| approximates | Q-function |
| assumes | Markov decision process setting |
| benefit | sample efficiency compared to purely online methods |
| canBeAppliedTo | continuous control problems; robotics |
| canReuse | previously collected experience |
| category | value-based reinforcement learning |
| contrastWith | online Q-learning |
| enables | policy derivation via greedy action selection over Q-values |
| handles | continuous state representations via neural networks |
| hasAbbreviation | NFQ |
| inputDataType | state-action-reward-next-state tuples |
| introducedIn | 2005 |
| isBasedOn | fitted Q-iteration framework |
| isDesignedFor | model-free reinforcement learning |
| learningType | off-policy |
| operatesOn | batches of experience |
| optimizationObjective | minimize temporal-difference error over batch |
| originalApplicationDomain | control tasks |
| output | approximate optimal action-value function |
| relatedTo | deep Q-learning; fitted value iteration |
| requires | collected dataset before each training phase; discrete action space in its basic form |
| supportsStateSpace | continuous state spaces; high-dimensional state spaces |
| trainingParadigm | batch training |
| updateProcess | iteratively refits Q-network to updated targets |
| updateStyle | fitted value iteration |
| usesFunctionApproximator | neural network |
| usesLossFunction | supervised regression loss |
| usesNetworkType | feedforward neural network |
| usesTarget | Bellman optimality equation |
| usesTechnique | experience replay-like batch reuse |
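
Several of the statements above (batch input of state-action-reward-next-state tuples, Bellman optimality targets, supervised regression loss, iterative refitting, greedy action selection) can be combined into one small sketch. The code below is a minimal illustration under stated assumptions, not a reference implementation: the names `bellman_targets`, `TinyQNet`, `nfq`, `greedy_actions`, and `collect_chain_batch` are hypothetical, the toy 1-D chain task is invented for the example, the network has one output per discrete action rather than taking the action as an input, and plain full-batch gradient descent stands in for whatever optimizer a real implementation would use.

```python
import numpy as np


def bellman_targets(rewards, next_q, dones, gamma=0.95):
    """Targets r + gamma * max_a' Q(s', a'); no bootstrap at terminal states."""
    return rewards + gamma * (1.0 - dones) * next_q.max(axis=1)


class TinyQNet:
    """One-hidden-layer feedforward net: state -> one Q-value per action."""

    def __init__(self, state_dim, n_actions, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.5, (hidden, n_actions))
        self.b2 = np.zeros(n_actions)

    def predict(self, states):
        h = np.tanh(states @ self.W1 + self.b1)
        return h @ self.W2 + self.b2

    def fit(self, states, actions, targets, epochs=200, lr=0.05):
        """Full-batch gradient descent on squared error of the taken actions."""
        n = len(states)
        idx = np.arange(n)
        for _ in range(epochs):
            h = np.tanh(states @ self.W1 + self.b1)
            q = h @ self.W2 + self.b2
            # Only the Q-value of the action actually taken receives a gradient.
            grad_q = np.zeros_like(q)
            grad_q[idx, actions] = (q[idx, actions] - targets) / n
            grad_h = (grad_q @ self.W2.T) * (1.0 - h ** 2)
            self.W2 -= lr * (h.T @ grad_q)
            self.b2 -= lr * grad_q.sum(axis=0)
            self.W1 -= lr * (states.T @ grad_h)
            self.b1 -= lr * grad_h.sum(axis=0)


def nfq(batch, state_dim, n_actions, iterations=20, gamma=0.95):
    """Repeatedly recompute targets on the fixed batch and refit the network."""
    s, a, r, s2, done = batch
    net = TinyQNet(state_dim, n_actions)
    for _ in range(iterations):
        targets = bellman_targets(r, net.predict(s2), done, gamma)
        net.fit(s, a, targets)
    return net


def greedy_actions(net, states):
    """Derive a policy by picking the action with the largest Q-value."""
    return net.predict(states).argmax(axis=1)


def collect_chain_batch(n=500, seed=1):
    """Hypothetical toy task: move left/right on [0, 1]; reward 1 near 1.0."""
    rng = np.random.default_rng(seed)
    s = rng.uniform(0.0, 1.0, size=(n, 1))
    a = rng.integers(0, 2, size=n)            # 0 = left, 1 = right
    step = np.where(a == 1, 0.1, -0.1)
    s2 = np.clip(s + step[:, None], 0.0, 1.0)
    done = (s2[:, 0] >= 0.95).astype(float)   # episode ends at the right edge
    r = done.copy()                           # reward only on reaching the goal
    return s, a, r, s2, done


if __name__ == "__main__":
    batch = collect_chain_batch()
    net = nfq(batch, state_dim=1, n_actions=2)
    probe = np.array([[0.1], [0.5], [0.9]])
    print("Q-values:", net.predict(probe))
    print("greedy actions:", greedy_actions(net, probe))
```

The batch is collected once and reused in every iteration, which is the property the table describes as experience replay-like batch reuse. Only the regression targets change between refits.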
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.