Research Engineer, Alignment Science

Anthropic creates reliable, interpretable, and steerable AI systems for safe and beneficial use.
$230,000 - $515,000
Machine Learning
Hybrid
AI

Description For Research Engineer, Alignment Science

Anthropic is seeking a Research Engineer for its Alignment Science team to contribute to exploratory experimental research on AI safety. The role involves building and running machine learning experiments to understand and steer powerful AI systems, with a focus on risks from future systems. Key responsibilities include testing safety techniques, running multi-agent reinforcement learning experiments, building evaluation tools, and contributing to research papers and talks. The ideal candidate has significant software, ML, or research engineering experience, familiarity with technical AI safety research, and a collaborative work style. Experience with LLMs, reinforcement learning, and complex codebases is a plus. The position offers competitive compensation, including salary ($230,000 - $515,000 USD), equity, and comprehensive benefits. Anthropic values diversity and encourages applications from underrepresented groups. The company operates on a hybrid work model with at least 25% in-office time, preferably in the Bay Area, and offers visa sponsorship. Anthropic takes a "big science" approach to AI research, working as a cohesive team on large-scale efforts to advance steerable, trustworthy AI.


Responsibilities For Research Engineer, Alignment Science

  • Test robustness of safety techniques by training language models to subvert interventions
  • Run multi-agent reinforcement learning experiments
  • Build tooling to evaluate effectiveness of novel LLM-generated jailbreaks
  • Write scripts and prompts for evaluation questions on models' reasoning abilities
  • Contribute ideas, figures, and writing to research papers, blog posts, and talks
  • Run experiments for key AI safety efforts, including Anthropic's Responsible Scaling Policy

Requirements For Research Engineer, Alignment Science

  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Preference for fast-moving collaborative projects
  • Willingness to pick up slack outside the job description
  • Care about the impacts of AI

Benefits For Research Engineer, Alignment Science

  • Equity donation matching
  • Comprehensive health, dental, and vision insurance
  • 401(k) plan with 4% matching (US)
  • 22 weeks of paid parental leave (US)
  • Unlimited PTO
  • Stipends for education, home office improvements, commuting, and wellness (US)
  • Fertility benefits via Carrot (US)
  • Daily lunches and snacks in office
  • Relocation support for Bay Area (US)
  • Private health, dental, and vision insurance (UK)
  • Pension contribution matching 4% of salary (UK)
  • 21 weeks of paid parental leave (UK)
  • Health cash plan (UK)
  • Life insurance and income protection (UK)
