Research Engineer, Alignment Science

AI research company focused on creating reliable, interpretable, and steerable AI systems for safe and beneficial use.
$280,000 - $690,000
Machine Learning
Mid-Level Software Engineer
Hybrid
501 - 1,000 Employees
3+ years of experience
AI

Description For Research Engineer, Alignment Science

Anthropic is seeking a Research Engineer to join its Alignment Science team, focusing on creating safe and beneficial AI systems. The role combines scientific and engineering expertise to conduct exploratory experimental research on AI safety, particularly concerning powerful future systems.

The position involves working on critical projects including Scalable Oversight, AI Control, Alignment Stress-testing, and Automated Alignment Research. You'll be conducting machine learning experiments, testing safety techniques, running multi-agent reinforcement learning experiments, and building evaluation tools for LLM systems.

Anthropic operates as a cohesive team focused on large-scale research efforts, viewing AI research as an empirical science. The company values impact and collaboration, with frequent research discussions to ensure high-impact work. Their research builds on previous work including GPT-3, Circuit-Based Interpretability, Multimodal Neurons, and Scaling Laws.

The ideal candidate should have significant software, ML, or research engineering experience, familiarity with technical AI safety research, and a preference for collaborative projects. Strong candidates may have experience with LLMs, reinforcement learning, and complex shared codebases. The role requires at least 25% time in the Bay Area office.

As a public benefit corporation, Anthropic offers competitive compensation, benefits including equity donation matching, generous vacation and parental leave, flexible working hours, and a collaborative office space in San Francisco. They value diverse perspectives and encourage applications from candidates who might not meet every qualification, recognizing the social and ethical implications of their work.


Responsibilities For Research Engineer, Alignment Science

  • Build and run machine learning experiments
  • Test robustness of safety techniques
  • Run multi-agent reinforcement learning experiments
  • Build tooling to evaluate LLM-generated jailbreaks
  • Write scripts and prompts for evaluation questions
  • Contribute to research papers, blog posts, and talks
  • Run experiments for AI safety efforts

Requirements For Research Engineer, Alignment Science

  • Python
  • Kubernetes
  • Bachelor's degree in a related field or equivalent experience
  • Significant software, ML, or research engineering experience
  • Experience contributing to empirical AI research projects
  • Familiarity with technical AI safety research
  • Ability to work collaboratively
  • Must be able to spend at least 25% of time in the Bay Area office

Benefits For Research Engineer, Alignment Science

  • Competitive compensation
  • Optional equity donation matching
  • Generous vacation
  • Parental leave
  • Flexible working hours
  • Office space in San Francisco
  • Visa sponsorship available
