Research Engineer, Alignment Science

AI research company focused on creating reliable, interpretable, and steerable AI systems for safe and beneficial use.
$280,000 - $690,000
Machine Learning
Mid-Level Software Engineer
Hybrid
501 - 1,000 Employees
3+ years of experience
AI

Description For Research Engineer, Alignment Science

Anthropic is seeking a Research Engineer to join its Alignment Science team, focusing on creating safe and beneficial AI systems. The role combines scientific and engineering expertise to conduct exploratory experimental research on AI safety, particularly concerning powerful future systems.

The position involves working on critical projects including Scalable Oversight, AI Control, Alignment Stress-testing, and Automated Alignment Research. You'll be conducting machine learning experiments, testing safety techniques, running multi-agent reinforcement learning experiments, and building evaluation tools for LLM systems.

Anthropic operates as a cohesive team focused on large-scale research efforts, viewing AI research as an empirical science. The company values impact and collaboration, with frequent research discussions to ensure high-impact work. Their research builds on previous work including GPT-3, Circuit-Based Interpretability, Multimodal Neurons, and Scaling Laws.

The ideal candidate should have significant software, ML, or research engineering experience, familiarity with technical AI safety research, and a preference for collaborative projects. Strong candidates may have experience with LLMs, reinforcement learning, and complex shared codebases. The role requires at least 25% time in the Bay Area office.

As a public benefit corporation, Anthropic offers competitive compensation, benefits including equity donation matching, generous vacation and parental leave, flexible working hours, and a collaborative office space in San Francisco. They value diverse perspectives and encourage applications from candidates who might not meet every qualification, recognizing the social and ethical implications of their work.


Responsibilities For Research Engineer, Alignment Science

  • Build and run machine learning experiments
  • Test robustness of safety techniques
  • Run multi-agent reinforcement learning experiments
  • Build tooling to evaluate LLM-generated jailbreaks
  • Write scripts and prompts for evaluation questions
  • Contribute to research papers, blog posts, and talks
  • Run experiments for AI safety efforts

Requirements For Research Engineer, Alignment Science

  • Python
  • Kubernetes
  • Bachelor's degree in a related field or equivalent experience
  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Ability to work collaboratively
  • Must be able to spend at least 25% of time in the Bay Area office

Benefits For Research Engineer, Alignment Science

  • Competitive compensation
  • Optional equity donation matching
  • Generous vacation
  • Parental leave
  • Flexible working hours
  • Office space in San Francisco
  • Visa sponsorship available
