Research Scientist/Engineer - Finetuning Alignment

AI research company focused on creating reliable, interpretable, and steerable AI systems for safe and beneficial use.
$280,000 - $425,000
Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience
AI

Description For Research Scientist/Engineer - Finetuning Alignment

Anthropic is seeking a Research Scientist/Engineer to join its Finetuning Alignment team, focusing on developing AI systems that are reliable, truthful, and aligned with human values. This role combines cutting-edge AI research with practical engineering to minimize hallucinations and enhance truthfulness in language models.

The position offers an opportunity to work on significant challenges in AI safety and ethics, developing novel techniques for model truthfulness and accuracy. You'll be part of a collaborative team that approaches AI research as an empirical science, similar to physics and biology. The role involves creating sophisticated data curation pipelines, developing evaluation frameworks, and implementing retrieval-augmented generation systems.

Anthropic offers a competitive compensation package ranging from $280,000 to $425,000 USD, along with comprehensive benefits including flexible working hours and parental leave. The company maintains a hybrid work environment in San Francisco, requiring at least 25% office presence.

The ideal candidate will have an advanced degree in Computer Science or ML, strong Python skills, and experience with language model finetuning. They should be passionate about AI safety and have a track record of building systems for model accuracy and truthfulness. The role is an excellent opportunity to help ensure AI systems remain reliable and ethical while advancing the field of AI safety.

Working at Anthropic means joining a cohesive team focused on large-scale research efforts rather than smaller, specific puzzles. The company values impact and collaboration, with frequent research discussions and a strong emphasis on communication skills. This is a chance to work on meaningful problems that could shape the future of AI development.

Responsibilities For Research Scientist/Engineer - Finetuning Alignment

  • Design and implement novel data curation pipelines for training data accuracy
  • Develop specialized classifiers to detect potential hallucinations (see the sketch after this list)
  • Create and maintain comprehensive honesty benchmarks and evaluation frameworks
  • Implement search and retrieval-augmented generation (RAG) systems
  • Design and deploy human feedback collection systems
  • Design and implement prompting pipelines for model accuracy
  • Develop and test novel RL environments for truthful outputs
  • Create tools for human evaluators to assess model outputs
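As a purely illustrative sketch of the hallucination-classifier responsibility above, the snippet below trains a lightweight scikit-learn text classifier on hypothetical (response, supported/unsupported) labels. The data, labels, and model choice are stand-ins meant only to show the shape of the task, not Anthropic's pipeline; a production system would more likely finetune a language-model-based classifier.

    # Minimal, hypothetical hallucination classifier sketch.
    # Labels: 0 = supported by reference material, 1 = unsupported (likely hallucinated).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    responses = [
        "The Eiffel Tower was completed in 1889.",
        "The Eiffel Tower was moved to London in 1925.",
        "Water boils at 100 degrees Celsius at sea level.",
        "Water boils at 40 degrees Celsius at sea level.",
    ]
    labels = [0, 1, 0, 1]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(responses, labels)

    # Higher probability for class 1 suggests a likely hallucination.
    print(clf.predict_proba(["The Eiffel Tower is 330 metres tall."])[0][1])

A scoring step of this kind would typically sit inside a data curation pipeline, flagging or filtering training examples whose claims are not supported by source material.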

Requirements For Research Scientist/Engineer - Finetuning Alignment

Python
  • MS/PhD in Computer Science, ML, or related field
  • Strong programming skills in Python
  • Industry experience with language model finetuning and classifier training
  • Proficiency in experimental design and statistical analysis
  • Experience in data science or dataset curation for finetuning LLMs
  • Understanding of uncertainty, calibration, and truthfulness metrics (see the calibration sketch after this list)
  • Bachelor's degree in a related field or equivalent experience
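As a small illustration of the calibration requirement above, the sketch below computes expected calibration error (ECE) over binned confidence scores. The inputs and function name are hypothetical, not part of any Anthropic tooling.

    # Expected calibration error: mean |accuracy - confidence| per bin, weighted by bin size.
    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        bins = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(bins[:-1], bins[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if mask.any():
                gap = abs(correct[mask].mean() - confidences[mask].mean())
                ece += mask.mean() * gap
        return ece

    # Example: a model's self-reported confidence versus graded correctness.
    print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 1, 0, 1]))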

Benefits For Research Scientist/Engineer - Finetuning Alignment

Visa Sponsorship
Parental Leave
  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Office space for collaboration

Jobs Related To Anthropic Research Scientist/Engineer - Finetuning Alignment

Research Scientist/Engineer - Finetuning Alignment

Research Scientist/Engineer position at Anthropic focusing on developing truthful and reliable AI systems through advanced finetuning and alignment techniques.

Developer Relations Lead

Lead Developer Relations at Anthropic, shaping how developers experience and build with Claude AI through technical programs, events, and community engagement.

Interpretability Research Engineer

Senior research engineering role at Anthropic focusing on AI interpretability and safety, offering competitive compensation and the opportunity to work on cutting-edge AI systems.

ML Engineering Manager - Trust & Safety

Lead an Applied ML team in Trust & Safety at Anthropic, developing AI-driven detection models and implementing safety measures for AI services.