Research Engineer, Pretraining

Anthropic is at the forefront of AI research, dedicated to developing safe, ethical, and powerful artificial intelligence.
$300,000 - $340,000
Data
Staff Software Engineer
Hybrid
51 - 100 Employees
5+ years of experience
AI

Description For Research Engineer, Pretraining

Anthropic is a leading AI research company focused on creating reliable, interpretable, and steerable AI systems. Our mission is to ensure that transformative AI systems are aligned with human interests and are safe and beneficial for users and society as a whole. We are seeking a Research Engineer to join our Pretraining team, responsible for developing the next generation of large language models.

In this role, you will work at the intersection of cutting-edge research and practical engineering, contributing to the development of safe, steerable, and trustworthy AI systems. Key responsibilities include designing and implementing high-performance data processing infrastructure for large language model training, developing core processing primitives, building robust systems for data quality assurance, implementing monitoring systems, and creating optimized distributed computing systems for processing web-scale datasets.

The ideal candidate will have strong software engineering skills, expertise in Python and distributed computing frameworks, deep understanding of cloud computing platforms, experience with high-throughput system design, and excellent problem-solving and communication skills. Preferred qualifications include an advanced degree in Computer Science, experience with language model training infrastructure, and expertise in tokenization algorithms.

At Anthropic, we work as a single cohesive team on large-scale research efforts and value impact over work on smaller, more specific puzzles. We view AI research as an empirical science and greatly value communication skills. Our work continues research directions that include GPT-3, Circuit-Based Interpretability, Multimodal Neurons, Scaling Laws, AI & Compute, Concrete Problems in AI Safety, and Learning from Human Preferences.

We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in San Francisco. We are committed to fostering a diverse and inclusive workplace and strongly encourage applications from candidates of all backgrounds, including those from underrepresented groups in tech.

If you're passionate about pushing the boundaries of AI while prioritizing safety and ethics, we want to hear from you!

Last updated 18 days ago

Responsibilities For Research Engineer, Pretraining

  • Design and implement high-performance data processing infrastructure for large language model training
  • Develop and maintain core processing primitives (e.g., tokenization, deduplication, chunking) with a focus on scalability
  • Build robust systems for data quality assurance and validation at scale
  • Implement comprehensive monitoring systems for data processing infrastructure
  • Create and optimize distributed computing systems for processing web-scale datasets
  • Collaborate with research teams to implement novel data processing architectures
  • Build and maintain documentation for infrastructure components and systems
  • Design and implement systems for reproducibility and traceability in data preparation

Requirements For Research Engineer, Pretraining

Python
  • Strong software engineering skills with experience in building distributed systems
  • Expertise in Python and experience with distributed computing frameworks
  • Deep understanding of cloud computing platforms and distributed systems architecture
  • Experience with high-throughput, fault-tolerant system design
  • Strong background in performance optimization and system scaling
  • Excellent problem-solving skills and attention to detail
  • Strong communication skills and ability to work in a collaborative environment

Benefits For Research Engineer, Pretraining

Equity
Visa Sponsorship
  • Competitive compensation
  • Optional equity donation matching
  • Generous vacation
  • Parental leave
  • Flexible working hours
  • Office space in San Francisco

