Jan Leike, a prominent AI researcher who recently resigned from OpenAI and publicly criticized the company’s AI safety approach, has joined OpenAI competitor Anthropic to lead a new “superalignment” team.
In a post on X, Leike said his team at Anthropic will focus on various aspects of AI safety and security, specifically "scalable oversight," "weak-to-strong generalization" and automated alignment research.
According to a source familiar with the matter, Leike will report directly to Jared Kaplan, Anthropic's chief science officer. Anthropic researchers currently working on scalable oversight, techniques to keep the behavior of large-scale AI predictable and desirable, will move to report to Leike as his team is established.
In many ways, Leike's new team sounds similar in mission to OpenAI's recently dissolved Superalignment team. That team, which Leike co-led, had the ambitious goal of solving the core technical challenges of controlling superintelligent AI within four years, but often found its work hindered by OpenAI's leadership.
Anthropic has consistently tried to position itself as more safety-focused than OpenAI.
Anthropic's CEO, Dario Amodei, was once VP of research at OpenAI and reportedly split with the company over disagreements about its direction, specifically its growing commercial focus. Amodei brought with him a number of former OpenAI employees, including OpenAI's onetime policy lead Jack Clark.