Anthropic is seeking a Safeguards Research Engineer to join its mission of creating safe and beneficial AI systems. As part of the Safeguards Research Team, you'll conduct critical safety research and engineering to ensure AI systems can be deployed safely. The role involves working on immediate safety challenges and longer-term research initiatives, including jailbreak robustness, automated red-teaming, and applied threat modeling.
You'll collaborate with multiple teams, including Interpretability, Fine-Tuning, Frontier Red Team, and Alignment Science. The position requires both scientific and engineering mindsets, with a focus on risks from current and future powerful AI systems. Projects include testing the robustness of safety techniques, running multi-agent reinforcement learning experiments, and building tools to evaluate LLM-generated jailbreaks.
Anthropic operates as a cohesive team focused on large-scale research efforts, valuing overall impact over smaller, specific puzzles. The company views AI research as an empirical science and maintains a highly collaborative environment. The role offers competitive compensation ($320,000-$560,000), flexible working arrangements, and comprehensive benefits including visa sponsorship.
The ideal candidate will have significant software and ML experience, familiarity with AI safety research, and strong collaborative skills. Experience with LLMs, reinforcement learning, and research paper authorship is highly valued. The role is based in San Francisco and requires at least 25% office presence, with flexibility for remote work the rest of the time. Join Anthropic in its mission to advance steerable, trustworthy AI while working on cutting-edge safety challenges in the field.