OpenAI is seeking a Software Engineer, Model Inference, to join its Applied AI Engineering team in San Francisco. The role centers on scaling the inference infrastructure that efficiently serves customer requests for state-of-the-art AI models such as GPT-4 and DALL-E.
Key responsibilities include:
- Collaborating with ML researchers, engineers, and product managers to bring the latest technologies into production
- Implementing new techniques, tools, and architectures to improve model performance, latency, throughput, and efficiency
- Developing tools to identify bottlenecks and sources of instability, then designing and implementing solutions
- Optimizing code and the Azure VM fleet to maximize hardware utilization
Ideal candidates should have:
- An understanding of modern ML architectures and how to optimize them for inference
- End-to-end problem-solving skills
- At least 3 years of professional software engineering experience
- Expertise in HPC technologies (InfiniBand, MPI, CUDA)
- Experience with production distributed systems
- Self-direction and the ability to identify the most important problems to work on
- A humble attitude and eagerness to help colleagues
OpenAI offers a competitive salary range of $200K–$370K and is committed to diversity, equal opportunity, and providing reasonable accommodations to applicants with disabilities.
Join OpenAI in shaping the future of AI technology and ensuring its benefits are widely shared.