Evan Hubinger

E1038739

Evan Hubinger is an AI safety researcher at Anthropic, known for his work on AI alignment and interpretability, in particular mesa-optimization and the distinction between inner and outer alignment.


Statements (44)

Predicate Object
instanceOf person
associatedWithCommunity AI safety research community
effective altruism
associatedWithOrganization Machine Intelligence Research Institute (MIRI)
basedIn United States of America
surface form: United States
coAuthorOf Risks from Learned Optimization in Advanced Machine Learning Systems
contributedTo formalization of inner vs outer alignment distinction
taxonomy of AI alignment failures
earlyMemberOf Anthropic technical leadership
educatedAt Harvey Mudd College
employer Anthropic
fieldOfStudy computer science
mathematics
fieldOfWork AI alignment
AI interpretability
focusesOn alignment of large-scale machine learning systems
technical AI safety
understanding learned optimization in neural networks
hasBlog https://www.alignmentforum.org
hasGitHubProfile https://github.com/evhub
hasOnlineHandle evhub
hasPersonalWebsite https://evhub.github.io
hasPresentationOn inner alignment failures
mesa-optimization
hasResearchInterest deceptive alignment
interpretability tools for large models
scalable oversight
training dynamics of advanced AI systems
hasTalkOn AI alignment at EAG conferences
AI safety at technical workshops
knownAs Evan Hubinger
languageSpoken English
mainSubjectOfWork AI safety threat models
inner alignment
mesa-optimizers in machine learning systems
outer alignment
notableFor research on mesa-optimization
work on AI alignment
work on AI interpretability
occupation AI safety researcher
positionHeld research scientist at Anthropic
publishesPreprintsOn arXiv
writesFor Alignment Forum
LessWrong

Referenced by (1)

Full triples — surface form annotated when it differs from this entity's canonical label.

Anthropic hasEmployee Evan Hubinger