OpenAI's Fleet team is seeking a Software Engineer to support the computing environment that powers the company's cutting-edge AI research and product development. The role focuses on managing large-scale systems across data centers, GPUs, and networking infrastructure to ensure high availability, performance, and efficiency. The position is critical to maintaining operational excellence as OpenAI's data center capabilities expand.
The ideal candidate will apply fleet management techniques to real-world scenarios, ensuring operational stability and performance across OpenAI's infrastructure. They will be responsible for developing monitoring systems, optimizing resource utilization, and automating operational workflows. The role requires strong software engineering skills, experience with infrastructure operations, and a passion for system performance and reliability.
The role follows a hybrid model (3 days in office) in San Francisco and involves collaborating with research, engineering, and infrastructure teams to align operational practices with evolving compute needs. OpenAI offers a comprehensive package that includes a competitive salary ($310K-$465K), equity, medical benefits, 401(k) matching, generous parental leave, and learning opportunities.
OpenAI is committed to ensuring that AI benefits humanity, emphasizing safety and responsible deployment over unchecked growth. The company values diversity, provides equal opportunities, and offers reasonable accommodations to applicants with disabilities. Join OpenAI in shaping the future of technology while working on systems that power groundbreaking products like ChatGPT.