Research Engineer, Alignment Science

AI research company focused on creating reliable, interpretable, and steerable AI systems for safe and beneficial use.
$225,000 - $500,000
Machine Learning
Senior Software Engineer
Hybrid
501 - 1,000 Employees
5+ years of experience
AI

Description For Research Engineer, Alignment Science

Anthropic is seeking a Research Engineer to join its Alignment Science team, focusing on creating safe and beneficial AI systems. The role involves conducting experimental research on AI safety, particularly concerning powerful future systems (ASL-3 or ASL-4) under the company's Responsible Scaling Policy.

The position offers a unique opportunity to work on critical AI safety challenges, including AI Control and Alignment Evaluations. Key projects include testing the robustness of safety techniques, running multi-agent reinforcement learning experiments, and building evaluation tools for LLM-generated jailbreaks.

The ideal candidate combines scientific and engineering mindsets, with experience in software development, ML, or research engineering. They should be collaborative, adaptable, and deeply concerned about AI's societal impact. Experience with LLMs, reinforcement learning, or AI safety research is highly valued.

Anthropic operates as a cohesive team focused on large-scale research efforts and views AI research as an empirical science. The company offers a collaborative environment with frequent research discussions, building on the team's past work in areas such as GPT-3, Circuit-Based Interpretability, and Learning from Human Preferences.

The role is based in London, with a requirement to spend at least 25% of the time in the office and occasional travel to San Francisco. Anthropic provides competitive compensation and benefits, equity donation matching, generous leave policies, and a modern office space. The company values diversity and encourages applications from candidates with varied backgrounds and perspectives.


Responsibilities For Research Engineer, Alignment Science

  • Test robustness of safety techniques
  • Run multi-agent reinforcement learning experiments
  • Build tooling for evaluating LLM-generated jailbreaks
  • Create evaluation questions for model reasoning
  • Contribute to research papers, blog posts, and talks
  • Support Responsible Scaling Policy implementation

Requirements For Research Engineer, Alignment Science

Python
Kubernetes
  • Bachelor's degree in a related field or equivalent experience
  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Strong collaborative skills
  • Interest in AI impacts

Benefits For Research Engineer, Alignment Science

Visa Sponsorship
Equity
  • Competitive compensation
  • Optional equity donation matching
  • Generous vacation
  • Parental leave
  • Flexible working hours
  • Modern office space
  • Visa sponsorship available


Jobs Related To Anthropic Research Engineer, Alignment Science

Software Engineer

Senior Software Engineer role at Anthropic focusing on building large-scale ML systems with emphasis on safety and reliability.

Research Engineer

Research Engineer position at Anthropic focusing on developing next-generation large language models with emphasis on safety and ethics.

Research Scientist/Engineer - Alignment Finetuning

Senior ML Research Scientist/Engineer position at Anthropic, focusing on AI alignment and language model finetuning with competitive compensation and benefits.

Biosecurity Research Engineer

Senior Machine Learning Engineer role focused on AI safety and biosecurity research at Anthropic.

Safeguards Research Engineer

Senior AI Safety Research Engineer position at Anthropic, focusing on developing and implementing safety measures for advanced AI systems.