Evals Platform Engineer

Apollo Research focuses on behavioral model evaluations and AI safety, specifically addressing deceptive alignment in AI systems.
$80,000 - $150,000
DevOps
Senior Software Engineer
In-Person
11 - 50 Employees
5+ years of experience
AI · Cybersecurity

Description For Evals Platform Engineer

Apollo Research is at the forefront of AI safety, focusing on evaluating and auditing real-world AI models to address critical challenges like deceptive alignment. As a Platform Engineer, you'll join a dynamic team working on frontier AI evaluation research, with a strong emphasis on security. You'll have broad decision-making authority on the infrastructure stack, designing and maintaining the systems that researchers depend on daily.

The role combines infrastructure expertise with hands-on development, working alongside software engineers and research scientists to ensure scalable, secure, and efficient systems. You'll be part of the Evals team, collaborating with experienced professionals like Rusheb Shah and Andrei Matveiakin, while interacting with the broader research team.

Key projects include building job orchestration systems for LLM evals, implementing secure databases, managing permissions structures, and maintaining secure development environments. The position offers significant autonomy in technical decisions while contributing to crucial AI safety research.

The company culture emphasizes truth-seeking, goal-oriented work, and constructive feedback. The role is based in London, in office space shared with LISA, and offers competitive compensation, flexible working arrangements, and comprehensive benefits. Apollo Research values diversity and provides equal opportunities to all candidates.

This is an excellent opportunity for an experienced infrastructure specialist who wants to make a meaningful impact in AI safety while working with cutting-edge technology and a talented, dedicated team.

Responsibilities For Evals Platform Engineer

  • Design, implement, scale, and maintain infrastructure for running frontier LLM evals
  • Work closely with software engineers and researchers to understand and address infrastructure needs
  • Choose and integrate appropriate technologies for our infrastructure stack
  • Administer and secure internal AWS accounts
  • Enforce security best practices
  • Manage IAM permissions and access control
  • Manage CI/CD pipelines
  • Design and build data storage systems for evaluation results
  • Help set up and manage organisation-wide security processes
  • Contribute to development of internal software tools that leverage our infrastructure

Requirements For Evals Platform Engineer

Python
Kubernetes
Linux
  • Experience leading infrastructure projects from start to finish
  • Experience implementing security best practices for cloud and containerized environments
  • Solid knowledge of AWS, including IAM and EKS
  • Strong hands-on experience with Kubernetes
  • Experience with Infrastructure as Code tools
  • Strong software engineering skills, preferably in Python
  • Ability to work well with researchers and understand their technical needs

Benefits For Evals Platform Engineer

  • Flexible work hours and schedule
  • Unlimited vacation
  • Unlimited sick leave
  • Lunch, dinner, and snacks provided on workdays
  • Paid work trips, including staff retreats, business trips, and conferences
  • Yearly $1,000 USD professional development budget
  • Visa sponsorship available
  • Up to £10,000 relocation support through AI Futures Grants
