AlphaZero
E40166
AlphaZero is an artificial intelligence system developed by DeepMind that mastered chess, shogi, and Go through self-play reinforcement learning, starting from only the rules of each game and using no human-crafted strategies.
Statements (53)
| Predicate | Object |
|---|---|
| instanceOf | artificial intelligence system; game‑playing program |
| architectureType | deep neural network with Monte Carlo tree search |
| basedOn | Monte Carlo tree search; deep learning; reinforcement learning |
| contrastWith | programs relying on human expert knowledge; traditional chess engines using alpha‑beta search |
| countryOfOrigin | United Kingdom |
| creatorOrganizationType | AI research lab |
| defeated | Elmo shogi engine; Stockfish (surface form: Stockfish 8); previous Go programs based on AlphaGo Zero |
| designedFor | Go; chess; shogi |
| developer | DeepMind; DeepMind (surface form: Google DeepMind) |
| doesNotUse | endgame tablebases for search guidance; human‑crafted opening books |
| evaluationFunction | learned value function |
| field | artificial intelligence; computer Go; computer chess; computer shogi; machine learning |
| firstPublicAnnouncementDate | 2017-12-06 |
| firstPublicAnnouncementYear | 2017 |
| gameRepresentation | board positions encoded for neural networks |
| generalizationProperty | single algorithm applied to multiple games |
| hardwareUsed | TPUs (via XLA integrations) (surface form: TPUs) |
| learningObjective | maximize expected game outcome |
| learningParadigm | tabula rasa learning |
| notableFor | mastering Go through self‑play; mastering chess through self‑play; mastering shogi through self‑play |
| outperforms | AlphaGo Zero; Elmo; Stockfish |
| parentProject | AlphaGo (surface form: AlphaGo project) |
| policyRepresentation | probability distribution over moves |
| publicationTitle | AlphaZero (self-link; surface form: A general reinforcement learning algorithm that masters chess, shogi, and Go through self‑play) |
| publishedIn | Science |
| rewardSignal | game result win‑draw‑loss |
| searchGuidance | policy network priors; value network evaluations |
| searchTechnique | Monte Carlo tree search guided by neural networks |
| trainingDataSource | self‑generated game data |
| trainingMethod | self‑play |
| trainingRegime | self‑play reinforcement learning without human examples |
| uses | neural networks; policy network; value network |
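
The searchGuidance and searchTechnique statements above say that policy network priors and value network evaluations steer the Monte Carlo tree search. A minimal sketch of how such guidance can work is given here, using a simplified PUCT-style selection rule. The `Node` class, the function names, and the constant `c_puct = 1.5` are illustrative assumptions, not details taken from this page or from DeepMind's implementation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node (hypothetical structure, for illustration only)."""
    prior: float                  # policy network prior for the move leading here
    visit_count: int = 0          # number of simulations that passed through this node
    value_sum: float = 0.0        # sum of backed-up value estimates
    children: dict = field(default_factory=dict)  # move -> Node

    def mean_value(self) -> float:
        # Average backed-up value; 0.0 for a node that was never visited.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.5) -> float:
    """Mean value plus an exploration bonus that favours high-prior, rarely visited moves."""
    bonus = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.mean_value() + bonus


def select_move(parent: Node, c_puct: float = 1.5):
    """Return the move whose child has the highest PUCT score."""
    return max(parent.children, key=lambda m: puct_score(parent, parent.children[m], c_puct))


def backup(path, value: float) -> None:
    """Propagate a value estimate from the leaf back to the root.

    Assumed convention: `value` is from the point of view of the player who
    made the move into the last node of `path`; the sign flips at each ply
    because the players alternate.
    """
    for node in reversed(path):
        node.visit_count += 1
        node.value_sum += value
        value = -value
```

In a full search loop, `select_move` would be applied repeatedly from the root until an unexpanded position is reached. The networks would then supply priors for the new children and a value estimate for `backup`. The network evaluation itself is omitted here.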
Referenced by (20)
Full triples — surface form annotated when it differs from this entity's canonical label.
this entity surface form: Mastering chess and shogi by self-play with a general reinforcement learning algorithm
this entity surface form: A general reinforcement learning algorithm that masters chess, shogi, and Go through self‑play
this entity surface form: DeepMind AlphaZero
this entity surface form: AlphaZero (shogi) in DeepMind experiments