Senior System Software Engineer, Scalable ML Profiling Services

NVIDIA is the world leader in accelerated computing.
$180,000 - $339,250
Backend
Senior Software Engineer
Contact Company
8+ years of experience
AI

Description For Senior System Software Engineer, Scalable ML Profiling Services

We are seeking a deeply technical, creative, and hands-on software engineer to pioneer the next generation of scalable, always-available profiling services. This role will enable developers worldwide to harness the full power of NVIDIA GPUs. We are looking for someone who can help us build the best possible experience for ML performance engineers seeking to debug, profile, and optimize their training and serving pipelines using next-generation profiling technologies.

What you'll be doing:

  • Develop tools and features for NVIDIA GPUs that enable ML engineers to profile long-running ML workloads on single-node and multi-node clusters.
  • Synthesize customers' performance analysis use cases into the key GPU performance metrics that yield the insights they need.
  • Use the NVIDIA GPU performance monitoring system and design efficient hardware performance counter arrangements for observation.
  • Optimize GPU profiling tools to minimize overhead, improve observability, and make smart tradeoffs between observability and observer effects.
  • Innovate and improve our GPU profiling library with new features to maximize ML application performance.

What we need to see:

  • Strong proficiency in C, C++, and Python.
  • 8+ years of experience in system software development.
  • B.S. or M.S. in Electrical Engineering, Computer Science, or related technical field (or equivalent experience).
  • Experience in building performance analysis developer tools.
  • Strong computer science fundamentals, including algorithms, data structures, optimization, debugging, operating systems, parallel computing, and computer architecture.
  • Excellent written and verbal communication skills.

Ways to stand out from the crowd:

  • Background in working with drivers and system software.
  • Knowledge of GPU Compute APIs such as CUDA and OpenCL.
  • Prior experience developing tools for GPUs, and knowledge of compute architecture and operating systems.
  • Expertise in performance analysis, particularly for ML and GPU applications.
  • Demonstrated ability to select and implement efficient algorithms for complex problems.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Last updated 3 months ago

Responsibilities For Senior System Software Engineer, Scalable ML Profiling Services

  • Develop tools and features for NVIDIA GPUs for ML workload profiling
  • Synthesize customers' performance analysis use cases
  • Use the NVIDIA GPU performance monitoring system
  • Design efficient hardware performance counter arrangements
  • Optimize GPU profiling tools
  • Innovate and improve GPU profiling library

Requirements For Senior System Software Engineer, Scalable ML Profiling Services

Python
Linux
  • Strong proficiency in C, C++, and Python
  • 8+ years of experience in system software development
  • B.S. or M.S. in Electrical Engineering, Computer Science, or related technical field (or equivalent experience)
  • Experience in building performance analysis developer tools
  • Strong computer science fundamentals, including algorithms, data structures, optimization, debugging, operating systems, parallel computing and computer architecture
  • Excellent written and verbal communication skills

Benefits For Senior System Software Engineer, Scalable ML Profiling Services

Equity

