Deep Q-Learning
E444494
model-free reinforcement learning method
off-policy learning algorithm
reinforcement learning algorithm
value-based reinforcement learning method
Deep Q-Learning is a reinforcement learning algorithm that uses deep neural networks to approximate Q-values, enabling agents to learn effective policies directly from high-dimensional inputs like raw images.
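As a hedged illustration of what "approximating Q-values" means in training terms, the loss commonly associated with the target-network formulation can be sketched as below. The symbols are notational assumptions not given in this entry: $\theta$ for the online network parameters, $\theta^-$ for the target network parameters, $\gamma$ for the discount factor, and $\mathcal{D}$ for the replay buffer.

$$
L(\theta) = \mathbb{E}_{(s,a,r,s') \sim \mathcal{D}}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta^-) - Q(s, a; \theta)\right)^2\right]
$$

The term $r + \gamma \max_{a'} Q(s', a'; \theta^-)$ is the Bellman target, and stochastic gradient descent adjusts $\theta$ to shrink the squared difference between that target and the current estimate.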
Observed surface forms (4)
| Surface form | Occurrences |
|---|---|
| Deep Q-Network | 2 |
| DQN | 1 |
| DQN algorithm | 1 |
| Deep Recurrent Q-Learning | 1 |
Statements (47)
| Predicate | Object |
|---|---|
| instanceOf | model-free reinforcement learning method; off-policy learning algorithm; reinforcement learning algorithm; value-based reinforcement learning method |
| addresses | correlated training samples; instability in Q-Learning with function approximation; non-stationary target values |
| approximates | Q-values |
| assumes | discrete action space |
| basedOn | Q-Learning |
| belongsTo | deep reinforcement learning |
| canSufferFrom | overestimation bias |
| enables | learning from high-dimensional inputs; learning from raw images |
| estimates | action-value function |
| inspired | Double DQN; Dueling DQN; Prioritized Experience Replay; Rainbow DQN |
| isImplementedIn | Keras; PyTorch; TensorFlow |
| isNotSuitableFor | large continuous action spaces without modification |
| isTaughtIn | deep learning courses; reinforcement learning courses |
| isUsedFor | Atari 2600 game playing; control tasks; robotics tasks |
| learns | policy implicitly via Q-function |
| maps | states to action values |
| optimizes | expected cumulative reward |
| requires | interaction with environment; reward signal |
| typicallyUses | convolutional neural networks |
| updates | neural network parameters |
| uses | Bellman equation; deep neural networks; epsilon-greedy exploration; experience replay; function approximation; replay buffer; stochastic gradient descent; target network |
| wasDescribedIn | Playing Atari with Deep Reinforcement Learning |
| wasExtendedIn | Human-level control through deep reinforcement learning |
| wasIntroducedIn | 2013 |
| wasPopularizedBy | DeepMind |
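Several of the statements above (replay buffer, experience replay, epsilon-greedy exploration, Bellman equation, target network) name mechanisms that are easier to follow in code. The following is a minimal sketch using only the Python standard library, not a reference implementation. The names `ReplayBuffer`, `epsilon_greedy`, and `td_target`, and the default `gamma=0.99`, are hypothetical choices made for illustration. The neural network itself is omitted: Q-values are passed in as plain lists, and `next_q_values` is assumed to come from a target network.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling reduces the correlation between consecutive steps.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Bellman target: r if terminal, else r + gamma * max over next Q-values.

    next_q_values is assumed to be produced by a separate target network.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full training loop, a mini-batch drawn with `sample` would supply the transitions, `td_target` would supply the regression targets, and a gradient step would move the online network's Q-value for the taken action toward each target. The discrete `max` over actions in both helpers is also why the method assumes a discrete action space.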
Referenced by (6)
Full triples — surface form annotated when it differs from this entity's canonical label.
this entity surface form: Deep Q-Network
this entity surface form: DQN algorithm
this entity surface form: Deep Recurrent Q-Learning
this entity surface form: Deep Q-Network
this entity surface form: DQN