MuZero
E42386
MuZero is a DeepMind reinforcement learning algorithm that learns to plan and master complex games such as Go, chess, and Atari games without being given the rules in advance.
Observed surface forms (3)
Statements (48)
| Predicate | Object |
|---|---|
| instanceOf | DeepMind algorithm; model-based reinforcement learning algorithm; reinforcement learning algorithm |
| achieves | superhuman performance in Go; superhuman performance in chess; superhuman performance in shogi |
| architectureComponent | dynamics function; prediction function; representation function |
| basedOn | Monte Carlo tree search (surface form: Monte Carlo Tree Search); deep neural networks; model-based planning |
| canPlay | Atari 2600 games; Go; chess; shogi |
| category | game-playing AI system; planning algorithm |
| comparedTo | AlphaZero |
| countryOfOrigin | United Kingdom |
| developer | DeepMind |
| differenceFromAlphaZero | does not require known game rules for planning |
| field | artificial intelligence; machine learning; reinforcement learning |
| handles | discrete action spaces |
| inputType | raw observations such as images |
| inspiredBy | AlphaGo; AlphaGo Zero; AlphaZero |
| keyFeature | does not require prior knowledge of game rules; learns environment dynamics from data; plans using a learned model; searches in latent state space; uses value, policy, and reward prediction |
| learningSignal | game outcomes |
| notableFor | planning with a learned model without access to true environment dynamics; state-of-the-art performance on the Atari benchmark at time of publication |
| optimizationObjective | maximize expected cumulative reward |
| organization | DeepMind (surface form: Google DeepMind) |
| outperforms | prior model-free algorithms on Atari |
| publicationYear | 2019 |
| publishedIn | Nature |
| titleOfPaper | MuZero (surface form: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model) |
| trainingMethod | reinforcement learning; self-play |
| uses | gradient-based optimization |
| usesAlgorithm | Monte Carlo tree search (surface form: Monte Carlo Tree Search) |
Referenced by (8)
Full triples — surface form annotated when it differs from this entity's canonical label.
- this entity's surface form: Mastering the game of Go without human knowledge
- this entity's surface form: Mastering Atari, Go, chess and shogi by planning with a learned model
- this entity's surface form: Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model