Research Engineer / Scientist, Alignment Science

Anthropic's mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole.
$280,000 - $625,000
Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience

Description For Research Engineer / Scientist, Alignment Science

Anthropic is seeking a Research Engineer / Scientist for its Alignment Science team. This role involves building and running elegant and thorough machine learning experiments to help understand and steer the behavior of powerful AI systems. The focus is on making AI helpful, honest, and harmless, with an emphasis on risks from powerful future systems.

Key responsibilities include:

  • Contributing to exploratory experimental research on AI safety
  • Testing the robustness of safety techniques
  • Running multi-agent reinforcement learning experiments
  • Building tooling to evaluate LLM-generated jailbreaks
  • Writing scripts and prompts for evaluation questions
  • Contributing to research papers, blog posts, and talks
  • Running experiments that feed into key AI safety efforts

The ideal candidate should have:

  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Preference for fast-moving collaborative projects
  • Willingness to pick up slack outside of their job description
  • Care about the impacts of AI

Strong candidates may also have:

  • Experience authoring research papers in ML, NLP, or AI safety
  • Experience with LLMs and reinforcement learning
  • Experience with Kubernetes clusters and complex shared codebases

This role offers a competitive salary range of $280,000 - $625,000 USD, along with equity and comprehensive benefits. The position is hybrid, with an expectation to be in the office at least 25% of the time, and a preference for candidates able to be based in the Bay Area.

Anthropic values diversity and is committed to creating an inclusive environment. It encourages applications from candidates who may not meet every qualification but are passionate about AI safety and aligned with its mission.

Responsibilities For Research Engineer / Scientist, Alignment Science

  • Build and run machine learning experiments to understand and steer powerful AI systems
  • Contribute to exploratory experimental research on AI safety
  • Test the robustness of safety techniques
  • Run multi-agent reinforcement learning experiments
  • Build tooling to evaluate LLM-generated jailbreaks
  • Write scripts and prompts for evaluation questions
  • Contribute to research papers, blog posts, and talks
  • Run experiments that feed into key AI safety efforts

Requirements For Research Engineer / Scientist, Alignment Science

  • Python
  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Preference for fast-moving collaborative projects
  • Willingness to pick up slack outside of their job description
  • Care about the impacts of AI

Benefits For Research Engineer / Scientist, Alignment Science

  • Equity
  • Comprehensive health, dental, and vision insurance
  • 401(k) plan with 4% matching
  • 22 weeks of paid parental leave
  • Unlimited PTO
  • Stipends for education, home office improvements, commuting, and wellness
  • Fertility benefits via Carrot
  • Daily lunches and snacks in the office
  • Relocation support for those moving to the Bay Area

Jobs Related To Anthropic Research Engineer / Scientist, Alignment Science

Research Scientist/Engineer - Finetuning Alignment

Research Scientist/Engineer position at Anthropic focusing on developing truthful and reliable AI systems through advanced finetuning and alignment techniques.

Developer Relations Lead

Lead Developer Relations at Anthropic, shaping how developers experience and build with Claude AI through technical programs, events, and community engagement.

Interpretability Research Engineer

Senior research engineering role at Anthropic focusing on AI interpretability and safety, offering competitive compensation and the opportunity to work on cutting-edge AI systems.
