Research Scientist/Engineer - Finetuning Alignment

AI research company focused on creating reliable, interpretable, and steerable AI systems for safe and beneficial use.
$280,000 - $425,000
Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience
AI

Description For Research Scientist/Engineer - Finetuning Alignment

Anthropic is seeking a Research Scientist/Engineer to join its Finetuning Alignment team, focusing on developing AI systems that are reliable, truthful, and aligned with human values. This role combines cutting-edge AI research with practical engineering to minimize hallucinations and enhance truthfulness in language models.

The position offers an opportunity to work on significant challenges in AI safety and ethics, developing novel techniques for model truthfulness and accuracy. You'll be part of a collaborative team that approaches AI research as an empirical science, similar to physics and biology. The role involves creating sophisticated data curation pipelines, developing evaluation frameworks, and implementing retrieval-augmented generation systems.

Anthropic offers a competitive compensation package ranging from $280,000 to $425,000 USD, along with comprehensive benefits including flexible working hours and parental leave. The company maintains a hybrid work environment in San Francisco, requiring at least 25% office presence.

The ideal candidate will have an advanced degree in Computer Science or ML, strong Python skills, and experience with language model finetuning. They should be passionate about AI safety and have a track record of building systems for model accuracy and truthfulness. The role is an excellent opportunity to help ensure AI systems remain reliable and ethical while advancing the field of AI safety.

Working at Anthropic means joining a cohesive team focused on large-scale research efforts rather than smaller, specific puzzles. The company values impact and collaboration, with frequent research discussions and a strong emphasis on communication skills. This is a chance to work on meaningful problems that could shape the future of AI development.

Responsibilities For Research Scientist/Engineer - Finetuning Alignment

  • Design and implement novel data curation pipelines for training data accuracy
  • Develop specialized classifiers to detect potential hallucinations (see the sketch after this list)
  • Create and maintain comprehensive honesty benchmarks and evaluation frameworks
  • Implement search and retrieval-augmented generation (RAG) systems
  • Design and deploy human feedback collection systems
  • Design and implement prompting pipelines for model accuracy
  • Develop and test novel RL environments for truthful outputs
  • Create tools for human evaluators to assess model outputs
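As a purely illustrative sketch of the hallucination-classifier responsibility above, the snippet below trains a lightweight scikit-learn text classifier on hypothetical (response, supported/unsupported) labels. The data, labels, and model choice are stand-ins meant only to show the shape of the task, not Anthropic's pipeline; a production system would more likely finetune a language-model-based classifier.

    # Minimal, hypothetical hallucination classifier sketch.
    # Labels: 0 = supported by reference material, 1 = unsupported (likely hallucinated).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    responses = [
        "The Eiffel Tower was completed in 1889.",
        "The Eiffel Tower was moved to London in 1925.",
        "Water boils at 100 degrees Celsius at sea level.",
        "Water boils at 40 degrees Celsius at sea level.",
    ]
    labels = [0, 1, 0, 1]

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
    clf.fit(responses, labels)

    # Higher probability for class 1 suggests a likely hallucination.
    print(clf.predict_proba(["The Eiffel Tower is 330 metres tall."])[0][1])

A scoring step of this kind would typically sit inside a data curation pipeline, flagging or filtering training examples whose claims are not supported by source material.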

Requirements For Research Scientist/Engineer - Finetuning Alignment

Python
  • MS/PhD in Computer Science, ML, or related field
  • Strong programming skills in Python
  • Industry experience with language model finetuning and classifier training
  • Proficiency in experimental design and statistical analysis
  • Experience in data science or dataset curation for finetuning LLMs
  • Understanding of uncertainty, calibration, and truthfulness metrics (see the calibration sketch after this list)
  • Bachelor's degree in a related field or equivalent experience
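As a small illustration of the calibration requirement above, the sketch below computes expected calibration error (ECE) over binned confidence scores. The inputs and function name are hypothetical, not part of any Anthropic tooling.

    # Expected calibration error: mean |accuracy - confidence| per bin, weighted by bin size.
    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        bins = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(bins[:-1], bins[1:]):
            mask = (confidences > lo) & (confidences <= hi)
            if mask.any():
                gap = abs(correct[mask].mean() - confidences[mask].mean())
                ece += mask.mean() * gap
        return ece

    # Example: a model's self-reported confidence versus graded correctness.
    print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 1, 0, 1]))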

Benefits For Research Scientist/Engineer - Finetuning Alignment

Visa Sponsorship
Parental Leave
  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Office space for collaboration

Jobs Related To Anthropic Research Scientist/Engineer - Finetuning Alignment

Research Scientist/Engineer - Finetuning Alignment

Research Scientist/Engineer position at Anthropic focusing on developing truthful and reliable AI systems through advanced finetuning and alignment techniques.

Developer Relations Lead

Lead Developer Relations at Anthropic, shaping how developers experience and build with Claude AI through technical programs, events, and community engagement.

Interpretability Research Engineer

Senior research engineering role at Anthropic focusing on AI interpretability and safety, offering competitive compensation and the opportunity to work on cutting-edge AI systems.

ML Engineering Manager - Trust & Safety

Lead an Applied ML team in Trust & Safety at Anthropic, developing AI-driven detection models and implementing safety measures for AI services.