Rainbow DQN
E200562
Rainbow DQN is a deep reinforcement learning algorithm that combines several key extensions to the original DQN—such as double Q-learning, prioritized replay, dueling networks, multi-step learning, distributional RL, and noisy nets—into a single, more performant agent.
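To illustrate how two of the listed extensions fit together, here is a minimal sketch of a scalar n-step target with a double Q-learning bootstrap. It is a simplification made for this example: the published agent applies the same idea to a distribution over returns rather than a single scalar, and the function and parameter names here are hypothetical.

```python
def n_step_double_q_target(rewards, gamma, q_online_next, q_target_next, done):
    """Scalar n-step target with a double Q-learning bootstrap (illustrative only).

    rewards: the n rewards r_t, ..., r_{t+n-1} collected after the state-action pair.
    q_online_next / q_target_next: per-action value estimates at state s_{t+n}
        from the online and target networks (hypothetical inputs for this sketch).
    done: True if the episode ended within the n steps, so no bootstrap is added.
    """
    # Discounted sum of the n observed rewards.
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if done:
        return n_step_return
    # Double Q-learning: the online network selects the action...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...and the target network evaluates it.
    bootstrap = q_target_next[best_action]
    return n_step_return + (gamma ** len(rewards)) * bootstrap


if __name__ == "__main__":
    target = n_step_double_q_target(
        rewards=[1.0, 0.0, 2.0],
        gamma=0.5,
        q_online_next=[0.1, 0.9],
        q_target_next=[4.0, 8.0],
        done=False,
    )
    print(target)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes the double Q-learning bootstrap from a plain maximum over the target network's estimates.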
All labels observed (3)
| Label | Occurrences |
|---|---|
| Rainbow DQN (canonical) | 3 |
| C51 distributional DQN | 1 |
| Rainbow: Combining Improvements in Deep Reinforcement Learning | 1 |
Statements (49)
| Predicate | Object |
|---|---|
| instanceOf | DQN extension; deep reinforcement learning algorithm; value-based reinforcement learning method |
| actionSpace | discrete action spaces |
| basedOn | Deep Q-Learning (surface form: Deep Q-Network) |
| citationYear | 2018 |
| codeAvailability | open-source implementations in multiple frameworks (e.g., PyTorch, TensorFlow) |
| combines | Double Q-learning; distributional reinforcement learning; dueling network architecture; multi-step learning; noisy networks for exploration; prioritized experience replay |
| developedAt | DeepMind |
| evaluatedOn | Atari 2600 games from the Arcade Learning Environment |
| goal | maximize expected cumulative reward |
| improvesOver | Rainbow DQN (self-link; surface form: C51 distributional DQN); Deep Q-Learning (surface form: DQN); Double DQN; Dueling DQN; Prioritized Experience Replay DQN (surface form: Prioritized DQN) |
| influenced | subsequent Atari benchmark baselines |
| introducedInPaper | Rainbow DQN (self-link; surface form: Rainbow: Combining Improvements in Deep Reinforcement Learning) |
| learningSignal | temporal-difference error |
| optimizationAlgorithm | stochastic gradient descent variant |
| outperforms | baseline DQN on Atari benchmarks |
| proposedBy | Bilal Piot; Dan Horgan; David Silver; Georg Ostrovski; Hado van Hasselt; Joseph Modayil; Matteo Hessel; Mohammad Azar; Tom Schaul; Will Dabney |
| publishedAt | AAAI Conference on Artificial Intelligence (surface form: AAAI 2018) |
| taskType | model-free reinforcement learning |
| uses | Q-learning update rule; epsilon-greedy policy with noisy nets modification; experience replay buffer; target network |
| usesDistributionalMethod | categorical value distribution (C51) |
| usesExplorationMethod | noisy linear layers |
| usesFunctionApproximator | deep convolutional neural network |
| usesMultiStepReturn | n-step returns |
| usesNetworkArchitecture | dueling network with value and advantage streams |
| usesPrioritization | proportional prioritized replay |
| valueRepresentation | distribution over returns |
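The proportional prioritized replay entry above can be sketched as follows. This is a minimal illustration, assuming the standard proportional scheme in which a transition's sampling probability is its priority raised to an exponent and then normalized, with importance-sampling weights correcting the resulting bias. The function names and the default values of `alpha` and `beta` are assumptions for this example, not settings taken from this entry.

```python
def sampling_probabilities(priorities, alpha=0.5):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probabilities, beta=0.4):
    """Importance-sampling weights w_i = (N * P(i))^(-beta), scaled so max(w) = 1."""
    n = len(probabilities)
    raw = [(n * p) ** (-beta) for p in probabilities]
    max_w = max(raw)
    return [w / max_w for w in raw]


if __name__ == "__main__":
    # Hypothetical priorities for three stored transitions.
    probs = sampling_probabilities([1.0, 4.0, 9.0], alpha=0.5)
    weights = importance_weights(probs, beta=1.0)
    print(probs)
    print(weights)
```

With `alpha=0` every transition is sampled uniformly, and with `beta=1` the weights fully compensate for the non-uniform sampling; intermediate values trade off between the two.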
Referenced by (5)
Full triples — surface form annotated when it differs from this entity's canonical label.
this entity surface form: Rainbow: Combining Improvements in Deep Reinforcement Learning
this entity surface form: C51 distributional DQN