Software Engineer, Fleet Management

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
$360,000 - $440,000
Cloud
Senior Software Engineer
Hybrid
5+ years of experience
AI

Description For Software Engineer, Fleet Management

OpenAI's Fleet team is seeking a Software Engineer to support the computing environment that powers cutting-edge AI research and product development. This role focuses on managing large-scale systems across data centers, GPUs, and networking infrastructure, ensuring high availability and performance. The position is integral to enabling OpenAI's models, including ChatGPT, to operate at scale.

As a Software Engineer in Fleet Management, you'll be responsible for building systems to manage hardware, configurations, and vendor interactions. The role combines deep technical expertise in cloud and bare-metal infrastructure with a focus on automation and efficiency. You'll work in a hybrid environment (3 days in office) in San Francisco, with relocation assistance provided.

The ideal candidate should have extensive experience with cluster-level systems like Kubernetes and cloud providers, along with deep knowledge of server-level systems including Linux kernels and containerization. You'll be part of a team that prioritizes safety, reliability, and responsible AI deployment over unchecked growth.

OpenAI offers a competitive compensation package ranging from $360K to $440K, plus equity and comprehensive benefits including medical insurance, 401(k) matching, and generous parental leave. This is an opportunity to shape the future of AI technology while working with cutting-edge infrastructure at scale.

The role combines technical depth with strategic thinking, requiring someone who can both architect complex systems and collaborate effectively across teams. You'll be at the forefront of AI infrastructure, helping to build and maintain the systems that power some of the most advanced AI models in the world.

Last updated 8 days ago

Responsibilities For Software Engineer, Fleet Management

  • Design and build systems to manage both cloud and bare-metal fleets at scale
  • Develop tools that integrate low-level hardware metrics with high-level job scheduling
  • Leverage LLMs to coordinate vendor operations and optimize infrastructure workflows
  • Automate infrastructure processes
  • Collaborate with hardware, infrastructure, and research teams
  • Continuously improve tools, automation, processes, and documentation

Requirements For Software Engineer, Fleet Management

  • Strong software engineering skills with experience in large-scale infrastructure environments
  • Broad knowledge of cluster-level systems (Kubernetes, CI/CD pipelines, Terraform, cloud providers)
  • Deep expertise in server-level systems (systemd, containerization, Chef, Linux kernels, firmware management, host routing)
  • Experience optimizing the performance and reliability of large compute fleets
  • Ability to thrive in dynamic environments
  • Focus on automation, efficiency, and continuous improvement

Benefits For Software Engineer, Fleet Management

  • Medical, dental, and vision insurance for you and your family
  • Mental health and wellness support
  • 401(k) plan with 50% matching
  • Generous time off and company holidays
  • 24 weeks of paid birth-parent leave and 20 weeks of paid parental leave
  • Annual learning & development stipend ($1,500 per year)
  • Equity
