Software Engineer, Evals Infrastructure (Preparedness)

OpenAI

AI research and deployment company dedicated to ensuring general-purpose artificial intelligence benefits humanity

San Francisco, CA, USA

$310,000

Site Reliability

Staff Software Engineer

In-Person

7+ years of experience

Description For Software Engineer, Evals Infrastructure (Preparedness)

OpenAI is seeking a Software Engineer for its Evals Infrastructure team within the Safety Systems department. This role is crucial for ensuring the safe deployment of AI models and maintaining infrastructure reliability. The position sits within the Preparedness team, which focuses on identifying and mitigating risks associated with frontier AI models.

The role combines site reliability engineering with AI safety, requiring expertise in scaling infrastructure and implementing robust monitoring systems. You'll be responsible for maintaining and enhancing system stability while supporting OpenAI's mission to develop safe AGI. The position offers a competitive salary of $310,000 plus equity and comprehensive benefits.

Key responsibilities include scaling evaluation infrastructure, implementing monitoring systems, and maintaining service level objectives. You'll work closely with cross-functional teams, participating in on-call rotations and production readiness reviews. The ideal candidate brings 7+ years of software engineering experience, strong cloud infrastructure knowledge, and expertise with tools like Kubernetes and observability platforms.

This San Francisco-based position offers an opportunity to work at the forefront of AI development while ensuring system reliability and safety. You'll join a team dedicated to preparing for and mitigating risks associated with increasingly capable AI systems. The role combines technical expertise with the broader mission of ensuring AI benefits humanity safely and effectively.

OpenAI provides comprehensive benefits including medical insurance, mental health support, generous parental leave, and learning opportunities. The company fosters an inclusive culture and is committed to considering diverse perspectives in AI development.

Last updated 13 days ago

Responsibilities For Software Engineer, Evals Infrastructure (Preparedness)

Scale the infrastructure, supporting systems, and automation used for evaluations
Collaborate with development teams to make systems more reliable
Implement and manage monitoring systems
Develop and maintain service level objectives (SLOs) and indicators (SLIs)
Implement fault-tolerant and resilient design patterns
Build and maintain automation tools
Partner with engineers and researchers
Participate in on-call rotation

Requirements For Software Engineer, Evals Infrastructure (Preparedness)

Skills: Kubernetes, Linux

Bachelor's degree in Computer Science, Information Technology, or related field
7+ years of professional software engineering experience
Experience as a reliability engineer in a fast-paced company
Strong proficiency in cloud infrastructure
Proficiency in programming/scripting languages
Experience with containerization and Kubernetes
Knowledge of Infrastructure as Code tools
Experience with observability tools (DataDog, Prometheus, Grafana, Splunk, ELK stack)
Experience with microservices architecture
Knowledge of security best practices in cloud environments

Benefits For Software Engineer, Evals Infrastructure (Preparedness)

Medical, dental, and vision insurance for you and your family
Mental health and wellness support
401(k) plan with 50% matching
Generous time off and company holidays
24 weeks of paid birth-parent leave and 20 weeks of paid parental leave
Annual learning & development stipend ($1,500 per year)
Equity compensation
Relocation assistance
