Original listing text, shown exactly as published by the company.
Across the workstreams, you may be a good fit if you
- Are motivated by making sure AI is safe and beneficial for society as a whole
- Are excited to transition into empirical AI research and would be interested in a full-time role at Anthropic
- Have a strong technical background in computer science, mathematics, or physics
- Thrive in fast-paced, collaborative environments
- Can implement ideas quickly and communicate clearly
Strong candidates may also have
- Strong background in a discipline relevant to a specific Fellows workstream (e.g. economics, social sciences, or cybersecurity)
- Experience in areas of research or engineering related to their workstream
Candidates must be
- Fluent in Python programming
- Available to work full-time on the Fellows program
AI Safety Fellows
Mentors, research areas, & past projects
Fellows will undergo a project selection & mentor matching process. Potential mentors include
- Sam Bowman
- Sara Price
- Alex Tamkin
- Nina Panickssery
- Trenton Bricken
- Logan Graham
- Jascha Sohl-Dickstein
- Joe Benton
- Fabien Roger
- Samuel Marks
- Kyle Fish
- Ethan Perez
Note: You may research mentors' prior work, but all applications must go through the official form, not the mentors.
Our mentors will lead projects in select AI safety research areas, such as
- Scalable Oversight: Developing techniques to keep highly capable models helpful and honest, even as they surpass human-level intelligence in various domains.
- Adversarial Robustness and AI Control: Creating methods to ensure advanced AI systems remain safe and harmless in unfamiliar or adversarial scenarios.
- Model Organisms: Creating model organisms of misalignment to improve our empirical understanding of how alignment failures might arise.
- Model Internals / Mechanistic Interpretability: Advancing our understanding of the internal workings of large language models to enable more targeted interventions and safety measures.
- AI Welfare: Improving our understanding of potential AI welfare and developing related evaluations and mitigations.
On our Alignment Science and Frontier Red Team blogs, you can read about past projects, including
- Subliminal Learning: Language Models Transmit Behavioral Traits via Hidden Signals in Data: Alex Cloud and Minh Le, et al., mentors including Samuel Marks and Owain Evans
- Open-source circuits: Michael Hanna and Mateusz Piotrowski with mentorship from Emmanuel Ameisen and Jack Lindsey
For a full list of representative projects for each area, please see these blog posts: Introducing the Anthropic Fellows Program for AI Safety Research, Recommendations for Technical AI Safety Research Directions.
Unique candidate criteria
You might be a particularly great fit for this workstream if you
- Are motivated by reducing catastrophic risks from advanced AI systems
- Have experience with empirical ML research projects
- Have experience working with large language models
- Have experience in one of the research areas mentioned above
- Have a track record of open-source contributions
AI Security Fellows
Mentors, research areas, & past projects
Fellows will undergo a project selection & mentor matching process. Potential mentors include
- Nicholas Carlini
- Keri Warr
- Evyatar Ben Asher
- Keane Lucas
- Newton Cheng
On our Alignment Science and Frontier Red Team blogs, you can read about some past Fellows projects, including:
- AI agents find $4.6M in blockchain smart contract exploits: Winnie Xiao and Cole Killian, mentored by Nicholas Carlini and Alwin Peng
- Strengthening Red Teams: A Modular Scaffold for Control Evaluations: Chloe Loughridge et al., mentored by Jon Kutasov and Joe Benton
Unique candidate criteria
You might be a particularly great fit for this workstream if you
- Are motivated by reducing catastrophic risks from advanced AI systems
- Have contributed to open-source projects in LLM- or security-adjacent repositories
- Have demonstrated success in bringing clarity and ownership to ambiguous technical problems
- Have experience with pentesting, vulnerability research, or other offensive security work
- Have a demonstrated willingness to do the "dirty work" that produces high-quality outputs
- Have reported CVEs or been awarded bug bounties
- Have experience with empirical ML research projects
- Have experience with deep learning frameworks and experiment management
ML Systems & Performance Fellows
Mentors, research areas, & past projects
Fellows will undergo a project selection & mentor matching process. Potential mentors include
- Alwin Peng
- Zygi Straznickas
For a past example of an engineering-heavy project, see
- AI agents find $4.6M in blockchain smart contract exploits
Projects in this workstream may include
- Building a CPU simulator for accelerator workloads
- Adding backends for different accelerators on an open-source project
- Building on-demand infrastructure for other infrastructure-heavy Fellows projects
- Building complex synthetic data or environment pipelines
Unique candidate criteria
You might be a particularly great fit for this workstream if you
- Have strong software engineering skills with experience building complex ML systems
- Can balance research exploration with engineering rigor and operational reliability
- Enjoy collaborating across research and engineering disciplines
- Are comfortable working with large-scale distributed systems and high-performance computing (e.g. in trading)
- Have experience with training, fine-tuning, or evaluating large language models
- Are adept at analyzing and debugging model training processes
Reinforcement Learning Fellows
Mentors, research areas, & past projects
Fellows will undergo a project selection & mentor matching process. Potential mentors include
- Ruhua Jiang
- Kaidi Cao
- Sunny Duan
- David Brandfonbrener
- Colt Steele
- Dino Distefano
- Will Williams
Projects in this workstream may include
- Building model-based tools to better understand AI training data and improve training data quality
- A research project to better understand generalization
- Creating RL environments to improve Claude models at capabilities that are within your domain of expertise
- Building RL environments for safety-related tasks
- Conducting research and implementing solutions in areas such as RL algorithms
Unique candidate criteria
You might be a particularly great fit for this workstream if you
- Have strong software engineering skills with experience building complex ML systems
- Can balance research exploration with engineering rigor and operational reliability
- Enjoy collaborating across research and engineering disciplines
- Are comfortable working with large-scale distributed systems and high-performance computing
- Have experience with training, fine-tuning, or evaluating large language models
- Are adept at analyzing and debugging model training processes
The Anthropic Institute Fellows (Economics & Policy)
Mentors, research areas, & past projects
Fellows will undergo a project selection & mentor matching process. Potential research areas and mentors include:
- Economics
  - Maxim Massenkoff
  - Peter McCrory
- Policy, Security, and Society
  - Jack Clark
  - Marina Favaro
  - Jim Baker
Projects in this workstream may include
- Designing and conducting empirical research on AI's economic effects, drawing on external data sources
- Developing new methodological approaches for studying AI's impact on labor markets, the future of work, and society
- Analysing the offense–defense balance for AI-enabled cyber and bio capabilities as models scale
- Measuring the extent to which model performance increases with custom harnesses
- Identifying market-driven mechanisms that could improve societal resilience to anticipated threats from AI systems
- Identifying which metrics relating to AI R&D could serve as early warning signals for recursive self-improvement
For past project examples, see
- How AI Impacts Skill Formation: Judy Shen and Alex Tamkin
- Stress-Testing Model Specs Reveals Character Differences among Language Models: Jifan Zhang, Henry Sleight, Andi Peng, John Schulman, and Esin Durmus
Unique candidate criteria
You might be a particularly great fit for this workstream if you
- Have an interest in economics or policy research; prior experience in this area is a plus but not required
- Are adaptable and collaborative, able to take direction and contribute to team priorities rather than needing to pursue a predetermined research agenda…