Staff Software Engineer, Interpretability

Anthropic creates reliable, interpretable, and steerable AI systems, focusing on safe and beneficial AI development for users and society.
$315,000 - $560,000
Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience
AI

Description For Staff Software Engineer, Interpretability

Anthropic is seeking a Staff Software Engineer to join its Interpretability team and help create safe and beneficial AI systems. The role centers on mechanistic interpretability: understanding how neural networks function, an effort similar to doing "biology" or "neuroscience" of neural networks. The team recently achieved significant breakthroughs with the Claude 3.0 Sonnet model, extracting millions of meaningful features and demonstrating that the model's behavior can be modified.

The position requires 5-10+ years of software development experience and offers a salary range of $315,000 to $560,000 USD. The role combines technical expertise with research collaboration and requires high proficiency in at least one programming language, such as Python, Rust, Go, or Java. You'll implement research experiments, optimize workflows, and build tools that support AI safety improvements.

The team works in a hybrid arrangement based at Anthropic's San Francisco office, with at least 25% in-office presence required. Anthropic offers comprehensive benefits including equity, visa sponsorship, flexible hours, and generous leave policies. The company values diversity and encourages applications from candidates with varied perspectives and backgrounds.

As part of a cohesive team working on large-scale research efforts, you'll contribute to projects like optimizing sparse autoencoders across GPUs and building visualization tools for millions of features. The role emphasizes collaboration with researchers and other teams across Anthropic, including Alignment Science and Societal Impacts, to enhance model safety.

This position offers a unique opportunity to work at the forefront of AI safety and interpretability research, contributing to the understanding and development of trustworthy AI systems. The work directly impacts the safety and reliability of AI models like Claude, making it an ideal role for those passionate about responsible AI development and its societal implications.

Responsibilities For Staff Software Engineer, Interpretability

  • Implement and analyze research experiments in toy scenarios and large models
  • Set up and optimize research workflows for large-scale operations
  • Build tools and abstractions for rapid research experimentation
  • Develop tools and infrastructure to support teams in using Interpretability's work for model safety

Requirements For Staff Software Engineer, Interpretability

  • 5-10+ years of experience building software
  • Highly proficient in at least one programming language (Python, Rust, Go, Java)
  • Experience with empirical AI research projects
  • Strong ability to prioritize and direct effort toward impactful work
  • Comfortable with ambiguity and questioning assumptions
  • Preference for fast-moving collaborative projects
  • Interest in machine learning research and applications
  • Care about the societal impacts and ethics of your work

Benefits For Staff Software Engineer, Interpretability

  • Visa sponsorship
  • Equity
  • Competitive compensation and benefits
  • Optional equity donation matching
  • Generous vacation and parental leave
  • Flexible working hours
  • Office space in San Francisco
