We Were Right! Real Inner Misalignment
Intro to AI Safety, Remastered
Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...
The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment
Quantilizers: AI That Doesn't Try Too Hard
Sharing the Benefits of AI: The Windfall Clause
10 Reasons to Ignore AI Safety
9 Examples of Specification Gaming
Training AI Without Writing A Reward Function, with Reward Modelling
AI That Doesn't Try Too Hard - Maximizers and Satisficers
Is AI Safety a Pascal's Mugging?
A Response to Steven Pinker on AI
How to Keep Improving When You're Better Than Any Teacher - Iterated Distillation and Amplification
Why Not Just: Think of AGI Like a Corporation?
Safe Exploration: Concrete Problems in AI Safety Part 6
Friend or Foe? AI Safety Gridworlds extra bit
AI Safety Gridworlds
Experts' Predictions about the Future of AI
Why Would AI Want to do Bad Things? Instrumental Convergence
Superintelligence Mod for Civilization V
Intelligence and Stupidity: The Orthogonality Thesis
Scalable Supervision: Concrete Problems in AI Safety Part 5
AI Safety at EAGlobal2017 Conference
AI learns to Create ̵K̵Z̵F̵ ̵V̵i̵d̵e̵o̵s̵ Cat Pictures: Papers in Two Minutes #1