Double DQN
E101969
Double DQN is a reinforcement learning algorithm that improves upon standard Deep Q-Networks by reducing overestimation bias through decoupling action selection from action evaluation.
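The decoupling described above can be written as a pair of target equations. This is a sketch in conventional notation, not taken from this page: the symbols $\theta$ (online network parameters), $\theta^{-}$ (target network parameters), $r$ (reward), $\gamma$ (discount factor), and $s'$, $a'$ (next state and candidate next action) are assumptions introduced for illustration.

$$
y^{\text{DQN}} = r + \gamma \max_{a'} Q(s', a'; \theta^{-})
$$

$$
y^{\text{DoubleDQN}} = r + \gamma \, Q\big(s', \arg\max_{a'} Q(s', a'; \theta); \theta^{-}\big)
$$

In the first target the same network both picks and scores the maximizing action; in the second, the online network picks the action and the target network scores it.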
All labels observed (3)
| Label | Occurrences |
|---|---|
| Double DQN (canonical) | 5 |
| Double Deep Q-Network | 1 |
| paper "Deep Reinforcement Learning with Double Q-learning" | 1 |
How this entity was disambiguated
This entity first appeared as the object of triple T824083 — resolving that mention is where its identity was fixed. The disambiguator weighed these candidate entities and picked the highlighted one (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.
Target entity: Double DQN
Context triple: [OpenAI Baselines, implementsAlgorithm, Double DQN]
- A. Dueling DQN: Dueling DQN is a deep reinforcement learning algorithm that separates state-value and advantage estimations within its neural network architecture to improve learning efficiency and stability over standard DQN.
- B. Prioritized Experience Replay DQN: Prioritized Experience Replay DQN is a variant of the Deep Q-Network algorithm that improves learning efficiency by sampling more informative experiences with higher priority from the replay buffer.
- C. DDPG: DDPG (Deep Deterministic Policy Gradient) is a model-free, off-policy deep reinforcement learning algorithm designed for continuous action spaces, combining ideas from DQN and actor-critic methods.
- D. Atari deep Q-network: The Atari deep Q-network is a pioneering deep reinforcement learning system that learned to play a wide range of Atari 2600 video games directly from raw pixels at human-level or better performance.
- E. Hindsight Experience Replay: Hindsight Experience Replay is a reinforcement learning technique that improves sample efficiency by reinterpreting failed attempts as successful experiences toward alternative goals.
- F. None of the above. (chosen)
- G. Unsure: the case is ambiguous or there is not enough information to decide.
Target entity: Double DQN
Target entity description: Double DQN is a reinforcement learning algorithm that improves upon standard Deep Q-Networks by reducing overestimation bias through decoupling action selection from action evaluation.
- A. Dueling DQN: Dueling DQN is a deep reinforcement learning algorithm that separates state-value and advantage estimations within its neural network architecture to improve learning efficiency and stability over standard DQN.
- B. Prioritized Experience Replay DQN: Prioritized Experience Replay DQN is a variant of the Deep Q-Network algorithm that improves learning efficiency by sampling more informative experiences with higher priority from the replay buffer.
- C. DDPG: DDPG (Deep Deterministic Policy Gradient) is a model-free, off-policy deep reinforcement learning algorithm designed for continuous action spaces, combining ideas from DQN and actor-critic methods.
- D. Atari deep Q-network: The Atari deep Q-network is a pioneering deep reinforcement learning system that learned to play a wide range of Atari 2600 video games directly from raw pixels at human-level or better performance.
- E. Hindsight Experience Replay: Hindsight Experience Replay is a reinforcement learning technique that improves sample efficiency by reinterpreting failed attempts as successful experiences toward alternative goals.
- F. None of the above. (chosen)
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf | Deep Q-Learning variant; model-free algorithm; off-policy algorithm; reinforcement learning algorithm; value-based reinforcement learning method |
| addressesProblem | overestimation of action values in DQN |
| alsoKnownAs | Double DQN (surface form: Double Deep Q-Network) |
| basedOn | Q-learning |
| category | deep reinforcement learning |
| commonlyUses | epsilon-greedy exploration; experience replay; target network with delayed updates |
| empiricalResult | often achieves higher scores than DQN on Atari benchmarks; reduces overestimation of Q-values compared to DQN |
| evaluationDomain | Atari 2600 (surface form: Atari 2600 games) |
| extends | Deep Q-Learning (surface form: Deep Q-Network) |
| frameworkSupport | implemented in many deep RL libraries |
| implementationDetail | shares architecture with DQN but changes target calculation |
| improvesUpon | Deep Q-Network performance stability; Deep Q-Network value estimation accuracy |
| influenced | Dueling DQN (surface form: Dueling Double DQN); Rainbow DQN |
| inspiredBy | Q-learning (surface form: Double Q-learning) |
| introducedBy | Arthur Guez; David Silver; Hado van Hasselt |
| keyIdea | decouple action selection from action evaluation |
| learningType | temporal-difference learning |
| modifies | target value computation of DQN |
| networkType | deep neural network approximator |
| notableProperty | maintains same computational complexity as DQN up to constant factors |
| optimizationMethod | stochastic gradient descent or variants |
| policyType | greedy policy w.r.t. learned Q-values |
| primaryGoal | reduce overestimation bias in Q-learning |
| publicationYear | 2015 |
| publishedIn | Double DQN (self-link; surface form differs: paper "Deep Reinforcement Learning with Double Q-learning") |
| reduces | positive bias in max operator over noisy value estimates |
| requires | discrete action space |
| targetComputation | uses argmax over online network Q-values to select action; uses target network Q-value of selected action for evaluation |
| trainingMode | batch updates from replay buffer |
| updateRule | uses separate networks in target for action selection and evaluation |
| usedIn | control tasks in simulated environments; game-playing agents |
| uses | online network for action selection; target network for action evaluation; two value estimates for action evaluation |
| valueFunction | approximates state-action value function Q(s,a) |
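The targetComputation, updateRule, and uses statements can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions rather than the method of any particular library: the function names, argument names, discount value, and toy Q-value arrays are all hypothetical, and the Q-values stand in for the outputs of an online network and a target network on a batch of next states.

```python
import numpy as np


def dqn_target(rewards, dones, q_target_next, gamma=0.99):
    """Standard DQN target: the target network both selects and evaluates."""
    max_q = q_target_next.max(axis=1)
    # (1 - dones) zeroes the bootstrap term for terminal transitions.
    return rewards + gamma * (1.0 - dones) * max_q


def double_dqn_target(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: online network selects, target network evaluates."""
    # Action selection: argmax over the online network's Q-values.
    best_actions = q_online_next.argmax(axis=1)
    # Action evaluation: target network's Q-value of the selected action.
    batch_index = np.arange(len(best_actions))
    evaluated_q = q_target_next[batch_index, best_actions]
    return rewards + gamma * (1.0 - dones) * evaluated_q


# Toy batch: two transitions, three discrete actions (made-up numbers).
rewards = np.array([1.0, 0.0])
dones = np.array([0.0, 1.0])  # the second transition is terminal
q_online_next = np.array([[0.2, 0.9, 0.1],
                          [0.5, 0.4, 0.3]])
q_target_next = np.array([[1.5, 0.6, 0.3],
                          [0.7, 0.8, 0.2]])

y_dqn = dqn_target(rewards, dones, q_target_next, gamma=0.5)
y_ddqn = double_dqn_target(rewards, dones, q_online_next, q_target_next, gamma=0.5)
print("DQN target:       ", y_dqn)
print("Double DQN target:", y_ddqn)
```

In this toy batch the target network's largest estimate for the first transition (1.5) belongs to an action the online network does not prefer, so the Double DQN target for that transition is lower than the DQN target. Both functions index a finite set of actions, which matches the requires statement (discrete action space).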
How these facts were elicited
The pipeline generated the facts above by prompting gpt-5.1 with this entity's name + description and the instruction below.
You are a knowledge base construction expert. Given a subject entity and a description of it, return factual statements that you know for the subject as a JSON list of dictionaries(triples), where keys must be "subject", "predicate" and "object". The number of facts may be very high, between 25 to 50 or more, for very popular subjects. For less popular subjects, the number of facts can be very low, like 5 or 10.
# Requirements
- If you don't know the subject at all, return an empty list.
- If the subject is not a named entity, return an empty list.
- Include at least one triple where predicate is "instanceOf".
- Do not get too wordy.
- Separate several objects into multiple triples with one object.
Subject: Double DQN
Description of subject: Double DQN is a reinforcement learning algorithm that improves upon standard Deep Q-Networks by reducing overestimation bias through decoupling action selection from action evaluation.
Referenced by (7)
Full triples — surface form annotated when it differs from this entity's canonical label.