Anthropic is seeking a Research Engineer to join its Alignment Science team, which focuses on creating safe and beneficial AI systems. The role involves conducting experimental research on AI safety, particularly for powerful future systems classified as ASL-3 or ASL-4 under Anthropic's Responsible Scaling Policy.
The position offers a unique opportunity to work on critical AI safety challenges, including AI Control and Alignment Evaluations. Key projects include testing the robustness of safety techniques, running multi-agent reinforcement learning experiments, and building tooling to evaluate the efficacy of LLM-generated jailbreaks.
The ideal candidate combines scientific and engineering mindsets, with experience in software development, ML, or research engineering. They should be collaborative, adaptable, and deeply concerned about AI's societal impact. Experience with LLMs, reinforcement learning, or AI safety research is highly valued.
Anthropic operates as a cohesive team focused on large-scale research efforts and views AI research as an empirical science. The company offers a collaborative environment with frequent research discussions, and its staff have previously contributed to work such as GPT-3, Circuit-Based Interpretability, and Learning from Human Preferences.
The role is based in London, with a requirement to spend at least 25% of working time in the office and occasional travel to San Francisco. Anthropic provides competitive compensation and benefits, equity donation matching, generous leave policies, and a modern office space. The company values diversity and encourages applications from candidates with varied backgrounds and perspectives.