AlphaZero

E40166

artificial intelligence system, game‑playing program

AlphaZero is an artificial intelligence system developed by DeepMind that learned to play chess, shogi, and Go at a superhuman level through self-play reinforcement learning, given only the rules of each game and no human-crafted strategies or example games.

Statements (53)

Predicate	Object
instanceOf	artificial intelligence system → game‑playing program →
architectureType	deep neural network with Monte Carlo tree search →
basedOn	Monte Carlo tree search → deep learning → reinforcement learning →
contrastWith	programs relying on human expert knowledge → traditional chess engines using alpha‑beta search →
countryOfOrigin	United Kingdom →
creatorOrganizationType	AI research lab →
defeated	AlphaGo Zero → Elmo shogi engine → Stockfish 8 →
designedFor	Go → chess → shogi →
developer	DeepMind → Google DeepMind →
doesNotUse	endgame tablebases for search guidance → human‑crafted opening books →
evaluationFunction	learned value function →
field	artificial intelligence → computer Go → computer chess → computer shogi → machine learning →
firstPublicAnnouncementDate	2017-12-06 →
firstPublicAnnouncementYear	2017 →
gameRepresentation	board positions encoded as stacks of feature planes for the neural network →
generalizationProperty	single algorithm applied to multiple games →
hardwareUsed	TPUs →
learningObjective	maximize expected game outcome →
learningParadigm	tabula rasa learning →
notableFor	mastering Go through self‑play → mastering chess through self‑play → mastering shogi through self‑play →
outperforms	AlphaGo Zero → Elmo → Stockfish →
parentProject	AlphaGo project →
policyRepresentation	probability distribution over moves →
publicationTitle	A general reinforcement learning algorithm that masters chess, shogi, and Go through self‑play →
publishedIn	Science →
rewardSignal	game result (win, draw, or loss) →
searchGuidance	policy network priors → value network evaluations →
searchTechnique	Monte Carlo tree search guided by neural networks →
trainingDataSource	self‑generated game data →
trainingMethod	self‑play →
trainingRegime	self‑play reinforcement learning without human examples →
uses	neural networks → policy network → value network →

Referenced by (15)

Subject (surface form when different)	Predicate
David Silver → DeepMind → Demis Hassabis →	knownFor
DeepMind ("AlphaGo Zero") → DeepMind →	developed
AlphaGo ("AlphaGo Zero") → AlphaGo →	inspired
AlphaGo ("AlphaGo Zero") → AlphaGo →	successor
MuZero →	comparedTo
MuZero →	inspiredBy
David Silver ("Mastering chess and shogi by self-play with a general reinforcement learning algorithm") →	notablePaper
David Silver →	notableWork
AlphaZero ("A general reinforcement learning algorithm that masters chess, shogi, and Go through self‑play") →	publicationTitle
AlphaStar →	relatedTo