Machine Learning Performance Engineer

Wayve is the leading developer of Embodied AI technology, creating advanced AI software and foundation models for autonomous vehicles.
Machine Learning
Senior Software Engineer
Hybrid
5+ years of experience
AI · Automotive

Description For Machine Learning Performance Engineer

Wayve, founded in 2017, is at the forefront of Embodied AI technology for autonomous vehicles. We're seeking a Machine Learning Performance Engineer to join our Machine Learning Platform team, focusing on optimizing large-scale training jobs as we scale our models.

Key responsibilities include:

  • Maximizing the Model FLOPs Utilization (MFU) of large-scale training jobs
  • Profiling and identifying bottlenecks in training code
  • Implementing GPU kernels to improve training throughput
  • Collaborating with Research teams to integrate and test efficiency improvements
  • Managing and enhancing our GPU training clusters

The ideal candidate will have:

  • 5+ years of experience in performance optimization or ML engineering
  • Expertise in optimizing large-scale training jobs on GPU compute clusters
  • Experience working in platform teams and with research teams
  • Proficiency in benchmarking and reporting performance metrics
  • Strong Python coding skills
  • BS or MS in Machine Learning, Computer Science, Engineering, or related field

Desirable skills include experience with concurrent and distributed computing, NVIDIA Nsight Systems, GPU kernel implementation, and a deep understanding of computing fundamentals.

At Wayve, we value diversity and inclusivity. We offer a hybrid working model, combining office time for innovation and collaboration with the flexibility of working from home. Join us in creating autonomy that propels the world forward!

Last updated 3 months ago

Responsibilities For Machine Learning Performance Engineer

  • Maximising the MFU of our large-scale training jobs
  • Profiling and identifying bottlenecks in training code
  • Implementing GPU kernels to improve training throughput
  • Working closely with Research teams to integrate and test training efficiency improvements
  • Owning and improving our GPU training clusters

Requirements For Machine Learning Performance Engineer

Python
Linux
  • 5+ years of experience in performance optimization or ML engineering
  • Experience optimizing large-scale training jobs on GPU compute clusters
  • Experience working in platform teams and with research teams
  • Experience tracking benchmarked performance over time and reporting it in an open and accessible way
  • Ability to write high-quality, well-structured, and tested Python code
  • BS or MS in Machine Learning, Computer Science, Engineering, or a related technical discipline, or equivalent experience

