Triple

T17738637
Position | Surface form | Disambiguated ID | Type / Status
Subject | Timothy P. Lillicrap | E442790 | entity
Predicate | coAuthorOf | P2389 | FINISHED
Object | Continuous control with deep reinforcement learning | NE | NER FINISHED

Disambiguation candidates (2 decisions)

The exact options the model was shown at each disambiguation step, with the option it chose marked "chosen". These decisions are the evidence behind this triple's disambiguated IDs.

NED1 Entity disambiguation (via context triple) gpt-5-mini-2025-08-07
Target entity: Continuous control with deep reinforcement learning
Context triple: [Timothy P. Lillicrap, coAuthorOf, Continuous control with deep reinforcement learning]
  • A. Deep Q-Learning
    Deep Q-Learning is a reinforcement learning algorithm that uses deep neural networks to approximate Q-values, enabling agents to learn effective policies directly from high-dimensional inputs like raw images.
  • B. Natural Policy Gradient
    Natural Policy Gradient is a reinforcement learning optimization method that improves policy gradient updates by accounting for the geometry of the parameter space using the Fisher information matrix, leading to more stable and efficient learning.
  • C. Atari deep Q-network
    The Atari deep Q-network is a pioneering deep reinforcement learning system that learned to play a wide range of Atari 2600 video games directly from raw pixels at human-level or better performance.
  • D. Soft Actor-Critic
    Soft Actor-Critic is a model-free deep reinforcement learning algorithm that combines off-policy learning with entropy maximization to achieve stable and sample-efficient continuous control.
  • E. Proximal Policy Optimization
    Proximal Policy Optimization is a popular reinforcement learning algorithm that improves policy gradient methods by using clipped objective functions to achieve stable and efficient training.
  • F. None of above. (chosen)
  • G. Unsure - the case is ambiguous/there is not enough information to decide.
NED2 Entity disambiguation (via description) gpt-5-mini-2025-08-07
Target entity: Continuous control with deep reinforcement learning
Target entity description: "Continuous control with deep reinforcement learning" is a highly influential research paper that introduced deep neural network methods for solving continuous-action reinforcement learning tasks, notably using deterministic policy gradients.
  • A. Deep Q-Learning
    Deep Q-Learning is a reinforcement learning algorithm that uses deep neural networks to approximate Q-values, enabling agents to learn effective policies directly from high-dimensional inputs like raw images.
  • B. Natural Policy Gradient
    Natural Policy Gradient is a reinforcement learning optimization method that improves policy gradient updates by accounting for the geometry of the parameter space using the Fisher information matrix, leading to more stable and efficient learning.
  • C. Atari deep Q-network
    The Atari deep Q-network is a pioneering deep reinforcement learning system that learned to play a wide range of Atari 2600 video games directly from raw pixels at human-level or better performance.
  • D. Soft Actor-Critic
    Soft Actor-Critic is a model-free deep reinforcement learning algorithm that combines off-policy learning with entropy maximization to achieve stable and sample-efficient continuous control.
  • E. Proximal Policy Optimization
    Proximal Policy Optimization is a popular reinforcement learning algorithm that improves policy gradient methods by using clipped objective functions to achieve stable and efficient training.
  • F. None of above. (chosen)
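
The two decisions above can be read as small records: a target entity, a lettered list of options, and the letter that was chosen. The following is a minimal Python sketch of that reading, not the code of the system that produced this page. The names Option, Decision, FALLBACK_PREFIXES and linked_candidate are hypothetical, and the rule that the "None of above" and "Unsure" options resolve to no candidate is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: options whose label starts with one of these are fallbacks,
# meaning no listed candidate was accepted for the target entity.
FALLBACK_PREFIXES = ("None of above", "Unsure")


@dataclass(frozen=True)
class Option:
    letter: str  # option letter as shown on the page, e.g. "A"
    label: str   # option title as shown on the page


@dataclass(frozen=True)
class Decision:
    step: str                    # e.g. "NED1"
    target: str                  # surface form being disambiguated
    options: Tuple[Option, ...]  # every option shown at this step
    chosen: str                  # letter of the chosen option

    def linked_candidate(self) -> Optional[str]:
        """Return the chosen candidate's label, or None for a fallback option."""
        picked = next(o for o in self.options if o.letter == self.chosen)
        if picked.label.startswith(FALLBACK_PREFIXES):
            return None
        return picked.label


TARGET = "Continuous control with deep reinforcement learning"

# Candidate titles copied from the option lists above.
CANDIDATES = (
    Option("A", "Deep Q-Learning"),
    Option("B", "Natural Policy Gradient"),
    Option("C", "Atari deep Q-network"),
    Option("D", "Soft Actor-Critic"),
    Option("E", "Proximal Policy Optimization"),
)
NONE_OF_ABOVE = Option("F", "None of above.")
UNSURE = Option(
    "G",
    "Unsure - the case is ambiguous/there is not enough information to decide.",
)

# NED1 offered options A to G; NED2 offered options A to F. Both chose F.
ned1 = Decision("NED1", TARGET, CANDIDATES + (NONE_OF_ABOVE, UNSURE), "F")
ned2 = Decision("NED2", TARGET, CANDIDATES + (NONE_OF_ABOVE,), "F")

for decision in (ned1, ned2):
    print(decision.step, decision.linked_candidate())
```

Under these assumptions both steps resolve to no candidate, because option F was chosen in each. Whether that outcome is what the "NE" value in the triple table records is not stated on this page.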

Provenance (2 batches)

Stage | Batch ID | Job type | Status
creating | batch_69d8b9ed3a2081909b2ec0d4dd2f4c37 | elicitation | completed
NER | batch_69e47acb05848190a4b7edb98f15b8c6 | ner | completed
Created at: April 10, 2026, 10:09 a.m.