Senior On-Device Model Inference Optimization Engineer

World leader in accelerated computing, pioneering AI and digital twin technology.
$220,000 - $339,250
Machine Learning
Senior Software Engineer
Hybrid
10+ years of experience
AI · Automotive

Description For Senior On-Device Model Inference Optimization Engineer

NVIDIA, a global leader in accelerated computing and AI technology, is seeking a Senior On-Device Model Inference Optimization Engineer to drive innovation in autonomous vehicle technology. This role combines cutting-edge AI optimization with practical implementation, requiring expertise in both theoretical machine learning and hands-on engineering.

The position demands a seasoned professional with 10+ years of experience who can lead efforts in improving the performance and efficiency of AI models. You'll be working with state-of-the-art technologies including CUDA, PyTorch, ONNX, and TensorRT, while implementing sophisticated optimization techniques such as pruning, quantization, and knowledge distillation.

As part of NVIDIA's team, you'll be at the forefront of developing solutions that power the next generation of autonomous vehicles. The role offers a competitive base salary range of $220,000 - $339,250, along with equity and comprehensive benefits. You'll be working in a collaborative environment that values innovation, precision, and technical excellence.

The ideal candidate will bring a combination of strong technical skills in machine learning optimization, programming proficiency in multiple languages, and the ability to work effectively across multidisciplinary teams. This is an opportunity to make a lasting impact on the future of autonomous vehicle technology while working with one of the most innovative companies in the field.

NVIDIA's commitment to diversity and inclusion, combined with its legacy of technological innovation, makes this an excellent opportunity for someone looking to push the boundaries of what's possible in AI and autonomous systems.

Last updated a month ago

Responsibilities For Senior On-Device Model Inference Optimization Engineer

  • Develop and implement strategies to optimize AI model inference for on-device deployment
  • Employ techniques like pruning, quantization, and knowledge distillation
  • Optimize performance-critical components using CUDA and C++
  • Collaborate with multi-functional teams
  • Benchmark inference performance and identify bottlenecks
  • Research and apply innovative methods for inference optimization
  • Adapt models for diverse hardware platforms
  • Create tools to validate accuracy and latency of deployed models
  • Recommend and implement model architecture changes

Requirements For Senior On-Device Model Inference Optimization Engineer

Python
Kubernetes
  • MSc or PhD in Computer Science, Engineering, or related field
  • Over 5 years of experience in model inference and optimization
  • 10+ years of work experience in a relevant area
  • Expertise in PyTorch, ONNX, and TensorRT
  • Experience in optimizing inference for transformer and convolutional architectures
  • Strong programming proficiency in CUDA, Python, and C++
  • In-depth knowledge of optimization techniques
  • Skilled in building and deploying scalable cloud-based inference systems
  • Strong collaboration and communication skills
  • Meticulous attention to detail

Benefits For Senior On-Device Model Inference Optimization Engineer

  • Equity
  • Comprehensive benefits package

Jobs Related To NVIDIA Senior On-Device Model Inference Optimization Engineer

Senior Software Engineer - Conversational AI

Senior Software Engineer position at NVIDIA focusing on building next-generation Conversational AI systems and Digital Human solutions using advanced Speech and LLM models.

Senior Software Engineer, Deep Learning Inference

Senior Software Engineer role at NVIDIA focusing on optimizing deep learning inference performance and implementing AI runtime solutions.

Senior System Software Engineer, Deep Learning Accelerator

Senior System Software Engineer role at NVIDIA focusing on Deep Learning Accelerator development, requiring 7+ years of experience in low-level software development and system architecture.

Deep Learning Engineer, End-to-end - Autonomous Driving

Senior Deep Learning Engineer position at NVIDIA focusing on end-to-end autonomous driving solutions, combining AI expertise with automotive technology.

Senior Software Engineer, TensorRT-LLM

Senior Software Engineer position at NVIDIA focusing on TensorRT-LLM development, requiring expertise in C++, deep learning, and AI inferencing optimization.