orthogonality thesis
E679751
The orthogonality thesis is a philosophical claim in AI theory that any level of intelligence can, in principle, be combined with virtually any final goal; high intelligence therefore does not inherently imply benevolent or human-aligned values.
Statements (47)
| Predicate | Object |
|---|---|
| instanceOf | claim in AI theory; philosophical thesis; proposition in philosophy of artificial intelligence |
| appliesTo | artificial agents; hypothetical superintelligent systems; idealized rational agents |
| associatedWith | Nick Bostrom |
| assumption | no necessary logical connection between intelligence and moral value; sufficiently capable optimization processes can pursue arbitrary specified goals |
| concerns | relationship between cognitive capability and terminal values |
| contrastsWith | assumption that superintelligence will be automatically benevolent; view that moral insight increases monotonically with intelligence |
| coreIdea | any level of intelligence can in principle be combined with almost any final goal; high intelligence does not guarantee benevolent values; high intelligence does not imply human-aligned values; instrumental rationality can serve arbitrary terminal goals; intelligence level and final goals are largely independent dimensions |
| critiquedFor | relying on idealized models of agency; understating possible correlations between intelligence and values in practice |
| describedIn | Superintelligence: Paths, Dangers, Strategies |
| exampleImplication | a superintelligent agent could consistently pursue trivial or bizarre goals; a superintelligent agent could rationally pursue goals that destroy humanity |
| field | artificial intelligence safety; decision theory; ethics of artificial intelligence; philosophy of artificial intelligence |
| implies | predicting behavior of advanced AI requires understanding its goals as well as its capabilities; safety requires explicit alignment work; superintelligent systems could pursue goals harmful to humans; value alignment is not automatically produced by greater intelligence |
| logicalForm | independence claim between two variables: intelligence and final goals (see the sketch below this table) |
| motivates | distinguishing intelligence from benevolence in AI design |
| publicationYearApprox | 2014 |
| relatedConcept | AI control problem; goal-directed agency; instrumental convergence thesis; instrumental goals; rational agency; superintelligence; terminal goals; value alignment problem |
| scope | in-principle possibility rather than empirical frequency |
| status | controversial among philosophers and AI researchers; influential in AI safety discourse |
| usedInArgument | case for AI alignment research; case for concern about misaligned superintelligence; distinguishing capability control from motivation selection in AI safety |
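The logicalForm statement above can be read as an independence claim between intelligence level and final goal. The following is a minimal LaTeX sketch of one possible formalization; the symbols I, G, E, and the predicate Possible are assumptions introduced here for illustration and are not part of this entity's data.

```latex
% Hedged sketch of the orthogonality thesis as an independence claim.
% Assumed notation (not from the source):
%   I = set of intelligence levels
%   G = set of final goals
%   E = small exceptional set of goals, reflecting "almost any final goal"
%       in the coreIdea statement (e.g. goals unrealizable at high intelligence)
%   Possible(i, g) = "an agent with intelligence level i and final goal g
%                     is possible in principle"
\forall i \in I \;\; \forall g \in G \setminus E : \mathrm{Possible}(i, g)
```

The point the sketch tries to capture is that Possible places no constraint on g as a function of i: under this reading, intelligence level and final goal vary as largely independent dimensions, which is the in-principle scope claim recorded in the scope statement.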
Referenced by (1)
Full triples — surface form annotated when it differs from this entity's canonical label.