Double DQN

E101969

Deep Q-Learning variant; model-free algorithm; off-policy algorithm; reinforcement learning algorithm; value-based reinforcement learning method

Double DQN is a reinforcement learning algorithm that improves upon standard Deep Q-Networks by reducing overestimation bias through decoupling action selection from action evaluation.
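
As a sketch of what this decoupling means, the two bootstrap targets can be written side by side. The notation is an assumption introduced here, not taken from this page: \theta denotes the online network parameters, \theta^{-} the target network parameters, r the reward, \gamma the discount factor, and s' the next state.

```latex
% Standard DQN: the target network both selects and evaluates the action.
y^{\mathrm{DQN}} = r + \gamma \max_{a} Q(s', a; \theta^{-})

% Double DQN: the online network selects, the target network evaluates.
y^{\mathrm{DoubleDQN}} = r + \gamma \, Q\bigl(s', \arg\max_{a} Q(s', a; \theta);\ \theta^{-}\bigr)
```

Because the action chosen by the online network is not necessarily the one that maximizes the target network's estimates, the Double DQN target is never larger than the standard DQN target for the same transition.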


All labels observed (3)

Label	Occurrences
Double DQN (canonical)	5
Double Deep Q-Network	1
paper "Deep Reinforcement Learning with Double Q-learning"	1

How this entity was disambiguated

This entity first appeared as the object of triple T824083; its identity was fixed when that mention was resolved. The disambiguator weighed the candidate entities below and picked the one marked "chosen" (or “None”, minting a new entity). This is how homonymy is resolved: the same surface form can point to different entities.

NED1 Entity disambiguation (via context triple) gpt-5-mini-2025-08-07

Target entity: Double DQN
Context triple: [OpenAI Baselines, implementsAlgorithm, Double DQN]

A. Dueling DQN
Dueling DQN is a deep reinforcement learning algorithm that separates state-value and advantage estimations within its neural network architecture to improve learning efficiency and stability over standard DQN.
B. Prioritized Experience Replay DQN
Prioritized Experience Replay DQN is a variant of the Deep Q-Network algorithm that improves learning efficiency by sampling more informative experiences with higher priority from the replay buffer.
C. DDPG
DDPG (Deep Deterministic Policy Gradient) is a model-free, off-policy deep reinforcement learning algorithm designed for continuous action spaces, combining ideas from DQN and actor-critic methods.
D. Atari deep Q-network
The Atari deep Q-network is a pioneering deep reinforcement learning system that learned to play a wide range of Atari 2600 video games directly from raw pixels at human-level or better performance.
E. Hindsight Experience Replay
Hindsight Experience Replay is a reinforcement learning technique that improves sample efficiency by reinterpreting failed attempts as successful experiences toward alternative goals.
F. None of the above. (chosen)
G. Unsure - the case is ambiguous/there is not enough information to decide.

NED2 Entity disambiguation (via description) gpt-5-mini-2025-08-07

Target entity: Double DQN
Target entity description: Double DQN is a reinforcement learning algorithm that improves upon standard Deep Q-Networks by reducing overestimation bias through decoupling action selection from action evaluation.

A. Dueling DQN
Dueling DQN is a deep reinforcement learning algorithm that separates state-value and advantage estimations within its neural network architecture to improve learning efficiency and stability over standard DQN.
B. Prioritized Experience Replay DQN
Prioritized Experience Replay DQN is a variant of the Deep Q-Network algorithm that improves learning efficiency by sampling more informative experiences with higher priority from the replay buffer.
C. DDPG
DDPG (Deep Deterministic Policy Gradient) is a model-free, off-policy deep reinforcement learning algorithm designed for continuous action spaces, combining ideas from DQN and actor-critic methods.
D. Atari deep Q-network
The Atari deep Q-network is a pioneering deep reinforcement learning system that learned to play a wide range of Atari 2600 video games directly from raw pixels at human-level or better performance.
E. Hindsight Experience Replay
Hindsight Experience Replay is a reinforcement learning technique that improves sample efficiency by reinterpreting failed attempts as successful experiences toward alternative goals.
F. None of the above. (chosen)

Statements (48)

Predicate	Object
instanceOf	Deep Q-Learning variant; model-free algorithm; off-policy algorithm; reinforcement learning algorithm; value-based reinforcement learning method
addressesProblem	overestimation of action values in DQN
alsoKnownAs	Double DQN (surface form: Double Deep Q-Network)
basedOn	Q-learning
category	deep reinforcement learning
commonlyUses	epsilon-greedy exploration; experience replay; target network with delayed updates
empiricalResult	often achieves higher scores than DQN on Atari benchmarks; reduces overestimation of Q-values compared to DQN
evaluationDomain	Atari 2600 (surface form: Atari 2600 games)
extends	Deep Q-Learning (surface form: Deep Q-Network)
frameworkSupport	implemented in many deep RL libraries
implementationDetail	shares architecture with DQN but changes target calculation
improvesUpon	Deep Q-Network performance stability; Deep Q-Network value estimation accuracy
influenced	Dueling DQN (surface form: Dueling Double DQN); Rainbow DQN
inspiredBy	Q-learning (surface form: Double Q-learning)
introducedBy	Arthur Guez; David Silver; Hado van Hasselt
keyIdea	decouple action selection from action evaluation
learningType	temporal-difference learning
modifies	target value computation of DQN
networkType	deep neural network approximator
notableProperty	maintains same computational complexity as DQN up to constant factors
optimizationMethod	stochastic gradient descent or variants
policyType	greedy policy w.r.t. learned Q-values
primaryGoal	reduce overestimation bias in Q-learning
publicationYear	2015
publishedIn	Double DQN (self-link; surface form: paper "Deep Reinforcement Learning with Double Q-learning")
reduces	positive bias in max operator over noisy value estimates
requires	discrete action space
targetComputation	uses argmax over online network Q-values to select action; uses target network Q-value of selected action for evaluation
trainingMode	batch updates from replay buffer
updateRule	uses separate networks in target for action selection and evaluation
usedIn	control tasks in simulated environments; game-playing agents
uses	online network for action selection; target network for action evaluation; two value estimates for action evaluation
valueFunction	approximates state-action value function Q(s,a)
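
The targetComputation and updateRule statements above can be illustrated with a minimal sketch. It is not taken from any particular library: the function names `double_dqn_targets` and `dqn_targets`, the list-based batch layout, and the default discount `gamma=0.99` are all assumptions made for illustration. The Q-values are passed in as plain lists, standing in for the outputs of the online and target networks on the next states.

```python
from typing import List, Sequence


def double_dqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_online_next: Sequence[Sequence[float]],
    q_target_next: Sequence[Sequence[float]],
    gamma: float = 0.99,
) -> List[float]:
    """Hypothetical helper: Double DQN targets for a batch of transitions."""
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no bootstrapping from the next state.
            targets.append(float(r))
            continue
        # Selection: greedy action according to the online network.
        a_star = max(range(len(q_on)), key=lambda a: q_on[a])
        # Evaluation: the target network's value for that same action.
        targets.append(r + gamma * q_tg[a_star])
    return targets


def dqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_target_next: Sequence[Sequence[float]],
    gamma: float = 0.99,
) -> List[float]:
    """Hypothetical helper: standard DQN targets, for comparison.

    The target network both selects and evaluates the action via max.
    """
    return [
        float(r) if done else r + gamma * max(q_tg)
        for r, done, q_tg in zip(rewards, dones, q_target_next)
    ]


if __name__ == "__main__":
    rewards = [1.0, 0.0]
    dones = [False, True]
    q_online_next = [[1.0, 2.0], [0.5, 0.1]]  # online network prefers action 1
    q_target_next = [[3.0, 1.5], [4.0, 2.0]]  # target network prefers action 0
    print(double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9))
    print(dqn_targets(rewards, dones, q_target_next, gamma=0.9))
```

In this made-up batch the two networks disagree about the best action in the first transition, so the Double DQN target (about 2.35) is lower than the standard DQN target (about 3.7). The second transition is terminal, so both targets reduce to the reward.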

How these facts were elicited

Referenced by (7)

Full triples — surface form annotated when it differs from this entity's canonical label.

OpenAI Baselines → implementsAlgorithm → Double DQN

Atari deep Q-network → inspiredAlgorithm → Double DQN

Dueling DQN → oftenCombinedWith → Double DQN

Prioritized Experience Replay DQN → compatibleWith → Double DQN

Double DQN → alsoKnownAs → Double DQN (surface form: Double Deep Q-Network)

Double DQN → publishedIn → Double DQN (self-link; surface form: paper "Deep Reinforcement Learning with Double Q-learning")

Rainbow DQN → improvesOver → Double DQN
