Hado van Hasselt
E441099
Hado van Hasselt is a reinforcement learning researcher best known for developing Double Q-learning and Double DQN, which reduce the overestimation bias of Q-learning, and for co-developing the Dueling DQN architecture for deep reinforcement learning.
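
As an illustration of the overestimation bias mentioned above, the following is a minimal tabular sketch, not code from any of the listed publications. It contrasts the standard Q-learning target, which takes a max over one value table, with a Double Q-learning style target, which selects the action with one table and evaluates it with the other. The function names, the list-of-lists table layout, and the parameters `alpha` and `gamma` are assumptions made for this example.

```python
import random


def q_learning_target(q, reward, next_state, gamma):
    """Standard target: the same table both selects and evaluates the action."""
    return reward + gamma * max(q[next_state])


def double_q_target(q_select, q_eval, reward, next_state, gamma):
    """Double Q-learning style target: select with one table, evaluate with the other."""
    values = q_select[next_state]
    best_action = max(range(len(values)), key=lambda a: values[a])
    return reward + gamma * q_eval[next_state][best_action]


def double_q_update(q_a, q_b, state, action, reward, next_state, alpha, gamma, rng):
    """Update one of the two tables, chosen at random, toward the decoupled target."""
    if rng.random() < 0.5:
        target = double_q_target(q_a, q_b, reward, next_state, gamma)
        q_a[state][action] += alpha * (target - q_a[state][action])
    else:
        target = double_q_target(q_b, q_a, reward, next_state, gamma)
        q_b[state][action] += alpha * (target - q_b[state][action])


# Hypothetical two-state, two-action tables (rows are states, columns are actions).
q_a = [[0.0, 0.0], [1.0, 2.0]]
q_b = [[0.0, 0.0], [3.0, 0.5]]

single = q_learning_target(q_a, reward=1.0, next_state=1, gamma=0.9)
double = double_q_target(q_a, q_b, reward=1.0, next_state=1, gamma=0.9)
double_q_update(q_a, q_b, 0, 0, 1.0, 1, alpha=0.5, gamma=0.9, rng=random.Random(0))
```

In this toy setup the single-table target uses the largest entry of `q_a` directly, while the decoupled target uses the value `q_b` assigns to the action `q_a` prefers, so noise that inflates one table's maximum does not carry straight into the target.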
All labels observed (1)
| Label | Occurrences |
|---|---|
| Hado van Hasselt (canonical) | 3 |
Statements (46)
| Predicate | Object |
|---|---|
| instanceOf | artificial intelligence researcher; computer scientist; person; reinforcement learning researcher |
| citizenship | Netherlands |
| coDeveloped | Dueling DQN architecture |
| developed | Double DQN algorithm; Double Q-learning algorithm |
| educatedAt | Utrecht University |
| employer | Google DeepMind |
| fieldOfWork | artificial intelligence; machine learning; reinforcement learning |
| hasAcademicDegree | PhD in computer science |
| hasPublicationType | conference papers; journal articles |
| hasRole | senior research scientist at DeepMind |
| influencedBy | Q-learning; temporal-difference learning methods |
| knownFor | Double DQN; Double Q-learning; Dueling DQN; deep reinforcement learning algorithms; reducing overestimation bias in Q-learning |
| languageSpoken | Dutch; English |
| memberOf | DeepMind |
| nationality | Dutch |
| notableWork | Deep Reinforcement Learning with Double Q-learning; Double Q-learning: Mitigating the overestimation bias in Q-learning; Dueling Network Architectures for Deep Reinforcement Learning |
| occupation | research scientist; researcher |
| publishedIn | AAAI; ICML; JMLR; NeurIPS |
| researchInterest | deep reinforcement learning; exploration in reinforcement learning; off-policy learning; temporal-difference learning; value-based reinforcement learning |
| thesisTopic | reinforcement learning |
| worksOn | applications of RL to games; scalable deep RL algorithms; stability and bias in value estimation |
Referenced by (3)
Full triples — surface form annotated when it differs from this entity's canonical label.