AlphaZero

E40166

AlphaZero is a DeepMind-developed artificial intelligence system that mastered complex games like chess, shogi, and Go through self-play reinforcement learning without human-crafted strategies.


Statements (53)
Predicate Object
instanceOf artificial intelligence system
game‑playing program
architectureType deep neural network with Monte Carlo tree search
basedOn Monte Carlo tree search
deep learning
reinforcement learning
contrastWith programs relying on human expert knowledge
traditional chess engines using alpha‑beta search
countryOfOrigin United Kingdom
creatorOrganizationType AI research lab
defeated Elmo shogi engine
Stockfish 8
previous Go programs based on AlphaGo Zero
designedFor Go
chess
shogi
developer DeepMind
Google DeepMind
doesNotUse endgame tablebases for search guidance
human‑crafted opening books
evaluationFunction learned value function
field artificial intelligence
computer Go
computer chess
computer shogi
machine learning
firstPublicAnnouncementDate 2017-12-06
firstPublicAnnouncementYear 2017
gameRepresentation board positions encoded for neural networks
generalizationProperty single algorithm applied to multiple games
hardwareUsed TPUs
learningObjective maximize expected game outcome
learningParadigm tabula rasa learning
notableFor mastering Go through self‑play
mastering chess through self‑play
mastering shogi through self‑play
outperforms AlphaGo Zero
Elmo
Stockfish
parentProject AlphaGo project
policyRepresentation probability distribution over moves
publicationTitle A general reinforcement learning algorithm that masters chess, shogi, and Go through self‑play
publishedIn Science
rewardSignal game result win‑draw‑loss
searchGuidance policy network priors
value network evaluations
searchTechnique Monte Carlo tree search guided by neural networks
trainingDataSource self‑generated game data
trainingMethod self‑play
trainingRegime self‑play reinforcement learning without human examples
uses neural networks
policy network
value network

Referenced by (15)
Subject (surface form when different) Predicate
David Silver
DeepMind
Demis Hassabis
knownFor
DeepMind ("AlphaGo Zero")
DeepMind
developed
AlphaGo ("AlphaGo Zero")
AlphaGo
inspired
AlphaGo ("AlphaGo Zero")
AlphaGo
successor
MuZero
comparedTo
MuZero
inspiredBy
David Silver ("Mastering chess and shogi by self-play with a general reinforcement learning algorithm")
notablePaper
David Silver
notableWork
AlphaZero ("A general reinforcement learning algorithm that masters chess, shogi, and Go through self‑play")
publicationTitle
AlphaStar
relatedTo

Please wait…