Research Engineer / Scientist, Alignment Science

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.
$280,000 - $625,000
Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience

Description For Research Engineer / Scientist, Alignment Science

Anthropic is seeking a Research Engineer / Scientist for its Alignment Science team. This role involves building and running elegant and thorough machine learning experiments to help understand and steer the behavior of powerful AI systems. The focus is on making AI helpful, honest, and harmless, with an emphasis on risks from powerful future systems.

Key responsibilities include:

  • Contributing to exploratory experimental research on AI safety
  • Testing the robustness of safety techniques
  • Running multi-agent reinforcement learning experiments
  • Building tooling to evaluate LLM-generated jailbreaks
  • Writing scripts and prompts for evaluation questions
  • Contributing to research papers, blog posts, and talks
  • Running experiments that feed into key AI safety efforts

The ideal candidate should have:

  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Preference for fast-moving collaborative projects
  • Willingness to pick up slack outside of their job description
  • Care about the impacts of AI

Strong candidates may also have:

  • Experience authoring research papers in ML, NLP, or AI safety
  • Experience with LLMs and reinforcement learning
  • Experience with Kubernetes clusters and complex shared codebases

This role offers a competitive salary range of $280,000 - $625,000 USD, along with equity and comprehensive benefits. The position is hybrid, with an expectation to be in the office at least 25% of the time, and a preference for candidates able to be based in the Bay Area.

Anthropic values diversity and is committed to creating an inclusive environment. It encourages applications from candidates who may not meet every qualification but are passionate about AI safety and aligned with its mission.

Responsibilities For Research Engineer / Scientist, Alignment Science

  • Build and run machine learning experiments to understand and steer powerful AI systems
  • Contribute to exploratory experimental research on AI safety
  • Test the robustness of safety techniques
  • Run multi-agent reinforcement learning experiments
  • Build tooling to evaluate LLM-generated jailbreaks
  • Write scripts and prompts for evaluation questions
  • Contribute to research papers, blog posts, and talks
  • Run experiments that feed into key AI safety efforts

Requirements For Research Engineer / Scientist, Alignment Science

  • Python
  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Preference for fast-moving collaborative projects
  • Willingness to pick up slack outside of their job description
  • Care about the impacts of AI

Benefits For Research Engineer / Scientist, Alignment Science

  • Equity
  • Comprehensive health, dental, and vision insurance
  • 401(k) plan with 4% matching
  • 22 weeks of paid parental leave
  • Unlimited PTO
  • Stipends for education, home office improvements, commuting, and wellness
  • Fertility benefits via Carrot
  • Daily lunches and snacks in the office
  • Relocation support for those moving to the Bay Area

Jobs Related To Anthropic Research Engineer / Scientist, Alignment Science

Research Scientist/Engineer - Finetuning Alignment

Research Scientist/Engineer position at Anthropic focusing on developing truthful and reliable AI systems through advanced finetuning and alignment techniques.

Developer Relations Lead

Lead Developer Relations at Anthropic, shaping how developers experience and build with Claude AI through technical programs, events, and community engagement.

Interpretability Research Engineer

Senior research engineering role at Anthropic focusing on AI interpretability and safety, offering competitive compensation and the opportunity to work on cutting-edge AI systems.
