Distributed Training Engineer, Sora

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity.
$295,000 - $440,000
Machine Learning
Staff Software Engineer
Hybrid
1,000 - 5,000 Employees
7+ years of experience

Description For Distributed Training Engineer, Sora

The Sora team at OpenAI is working on making video a key capability of OpenAI's foundation models. As a Distributed Training Engineer for Sora, you will work on improving the training throughput of our internal training framework and enabling researchers to experiment with new ideas. This role requires strong engineering skills, the ability to write bug-free machine learning code, and deep knowledge of supercomputer performance.

Key responsibilities include:

  • Collaborating with researchers to develop systems-efficient video models and architectures
  • Applying the latest techniques to achieve impressive hardware efficiency for training runs
  • Profiling and optimizing the training framework

The ideal candidate has experience with multi-modal ML pipelines, strong software engineering skills (particularly in Python), experience understanding and optimizing training kernels, and a passion for understanding stable training dynamics.

OpenAI offers a competitive compensation package, including a salary range of $295K – $440K, generous equity, and comprehensive benefits such as medical insurance, mental health support, 401(k) matching, unlimited time off, and paid parental leave.

This role is based in San Francisco, CA, with a hybrid work model of 3 days in the office per week. OpenAI is committed to diversity, equality, and creating an inclusive environment for all employees.

Last updated 5 months ago

Responsibilities For Distributed Training Engineer, Sora

  • Collaborate with researchers to enable them to develop systems-efficient video models and architectures
  • Apply the latest techniques to our internal training framework to achieve impressive hardware efficiency for our training runs
  • Profile and optimize our training framework

Requirements For Distributed Training Engineer, Sora

  • Experience working with multi-modal ML pipelines
  • Strong software engineering skills and proficiency in Python
  • Experience understanding and optimizing training kernels
  • Passion for understanding stable training dynamics
  • Ability to dive deep into systems implementations to improve performance and maintainability

Benefits For Distributed Training Engineer, Sora

  • Medical, dental, and vision insurance for you and your family
  • Mental health and wellness support
  • 401(k) plan with 50% matching
  • Unlimited time off and 13 company holidays per year
  • Paid parental leave (20 weeks) and family-planning support
  • Annual learning & development stipend ($1,500 per year)
  • Equity
