Deep Learning Performance Architect

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
Machine Learning
Senior Software Engineer
In-Person
5+ years of experience
AI

Description For Deep Learning Performance Architect

NVIDIA, the world leader in accelerated computing, is seeking a Deep Learning Performance Architect to join its team. This role focuses on developing GPU-accelerated deep learning software and optimizing deep learning kernels for inference. The position offers an opportunity to work at the forefront of AI technology, collaborating with researchers worldwide who use NVIDIA GPUs to power breakthroughs in numerous areas.

The role involves working with cutting-edge deep learning technologies, specifically optimizing inference performance with TensorRT. You'll be part of a fast-paced, customer-oriented team, developing solutions that impact sectors including automotive, image understanding, and speech processing. The position requires strong technical expertise in C/C++ programming, GPU architecture, and deep learning frameworks.

As a Deep Learning Performance Architect, you'll contribute to NVIDIA's mission of transforming industries through AI and digital twins. The role offers exposure to the latest developments in AI and machine learning, with opportunities to attend conferences and engage with customers for technical consultation. NVIDIA's reputation as one of technology's most desirable employers, combined with the chance to work alongside brilliant minds in the field, makes this an exceptional opportunity for those passionate about advancing AI technology.

The ideal candidate will bring strong academic credentials (Master's or PhD) in computer science or a related field, extensive programming experience, and a deep understanding of GPU architecture and optimization techniques. This role is an opportunity to shape the future of AI acceleration while working with state-of-the-art technology at a company that's driving innovation in multiple industries.

Last updated a day ago

Responsibilities For Deep Learning Performance Architect

  • Develop highly optimized deep learning kernels for inference
  • Perform performance optimization, analysis, and tuning
  • Collaborate with cross-functional teams across automotive, image understanding, and speech understanding
  • Travel to conferences and customers for technical consultation and training

Requirements For Deep Learning Performance Architect

  • Master's or PhD, or equivalent experience, in a relevant discipline (CE, CS&E, CS, AI)
  • Excellent C/C++ programming and software design skills
  • Knowledge of performance modelling, profiling, debugging, and code optimization
  • Architectural knowledge of CPU and GPU
  • GPU programming experience (CUDA or OpenCL)
  • Agile software development skills
  • Python experience (preferred)

