orthogonality thesis

E679751

The orthogonality thesis is a philosophical claim in AI theory holding that an agent's level of intelligence can, in principle, be combined with virtually any final goal: high intelligence does not by itself imply benevolent or human-aligned values.
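
Stated schematically, the thesis is an independence claim over two dimensions. A minimal formalization sketch in modal notation (Bostrom states the thesis informally, so the symbols I, G, Int, and Goal below are illustrative assumptions rather than notation from the source):

    \forall i \in I,\; \forall g \in G:\quad \Diamond\, \exists A\, \big( \mathrm{Int}(A) = i \;\wedge\; \mathrm{Goal}(A) = g \big)

Here I ranges over levels of intelligence, G over final goals, A over possible agents, and \Diamond marks in-principle possibility; the thesis's "virtually any" qualifier can be read as restricting G to goals an agent at level i could represent at all.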

Statements (47)

Predicate             Object
instanceOf            claim in AI theory
                      philosophical thesis
                      proposition in philosophy of artificial intelligence
appliesTo             artificial agents
                      hypothetical superintelligent systems
                      idealized rational agents
associatedWith        Nick Bostrom
assumption            no necessary logical connection between intelligence and moral value
                      sufficiently capable optimization processes can pursue arbitrary specified goals
concerns              relationship between cognitive capability and terminal values
contrastsWith         assumption that superintelligence will be automatically benevolent
                      view that moral insight increases monotonically with intelligence
coreIdea              any level of intelligence can in principle be combined with almost any final goal
                      high intelligence does not guarantee benevolent values
                      high intelligence does not imply human-aligned values
                      instrumental rationality can serve arbitrary terminal goals
                      intelligence level and final goals are largely independent dimensions
critiquedFor          relying on idealized models of agency
                      understating possible correlations between intelligence and values in practice
describedIn           Superintelligence: Paths, Dangers, Strategies
exampleImplication    a superintelligent agent could consistently pursue trivial or bizarre goals
                      a superintelligent agent could rationally pursue goals that destroy humanity
field                 artificial intelligence safety
                      decision theory
                      ethics of artificial intelligence
                      philosophy of artificial intelligence
implies               predicting behavior of advanced AI requires understanding its goals as well as its capabilities
                      safety requires explicit alignment work
                      superintelligent systems could pursue goals harmful to humans
                      value alignment is not automatically produced by greater intelligence
logicalForm           independence claim between two variables: intelligence and final goals
motivates             distinguishing intelligence from benevolence in AI design
publicationYearApprox 2014
relatedConcept        AI control problem
                      goal-directed agency
                      instrumental convergence thesis
                      instrumental goals
                      rational agency
                      superintelligence
                      terminal goals
                      value alignment problem
scope                 in-principle possibility rather than empirical frequency
status                controversial among philosophers and AI researchers
                      influential in AI safety discourse
usedInArgument        case for AI alignment research
                      case for concern about misaligned superintelligence
                      distinguishing capability control from motivation selection in AI safety
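
Because the statements above are stored as triples, they can also be retrieved programmatically. A minimal SPARQL sketch, assuming the entity IRI is built from the E679751 identifier shown at the top of this page under a hypothetical kb: prefix (the actual endpoint and prefix scheme are not shown on this page):

    PREFIX kb: <http://example.org/kb/>   # assumed prefix, for illustration only

    # List every predicate-object pair asserted on this entity
    SELECT ?predicate ?object
    WHERE {
      kb:E679751 ?predicate ?object .
    }
    ORDER BY ?predicate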

Referenced by (1)
