Senior System Software Engineer, NCCL - Partner Enablement

World leader in accelerated computing, pioneering AI and digital twin technology.
$148,000 - $287,500
Distributed Systems
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA, the pioneer in GPU technology and accelerated computing, is seeking a Senior System Software Engineer for its NCCL Partner Enablement team. The role sits at the intersection of deep learning, high-performance computing, and distributed systems, within the GPU Communications Libraries and Networking team. You will work with the NCCL and NVSHMEM runtimes for deep learning and HPC applications, an exceptional opportunity to understand the AI networking stack end to end.

The role combines technical expertise with customer-facing responsibilities, requiring deep knowledge of parallel programming, high-performance networking, and system optimization. You'll work with cutting-edge GPU clusters, cloud platforms, and networking technologies while guiding partners and customers through complex technical challenges. The position offers exposure to groundbreaking developments in AI and HPC and involves collaborating with internal teams across different time zones.

This is an ideal opportunity for someone passionate about distributed systems and AI infrastructure, with strong programming skills and experience in HPC or AI environments. The role offers competitive compensation, including equity, and the flexibility to work remotely while being part of a team that's pushing the boundaries of what's possible in AI and high-performance computing. NVIDIA's commitment to diversity and innovation makes this an excellent opportunity for those looking to make a significant impact in the field of accelerated computing.

Last updated 5 days ago

Responsibilities For Senior System Software Engineer, NCCL - Partner Enablement

  • Engage with partners and customers to root-cause functional and performance issues reported with NCCL
  • Conduct performance characterization and analysis of NCCL and DL applications on GPU clusters
  • Develop tools and automation to isolate issues on new systems and platforms
  • Guide customers and support teams on HPC knowledge and methodologies
  • Write documentation and conduct training sessions and webinars for NCCL
  • Engage with internal teams across different time zones

Requirements For Senior System Software Engineer, NCCL - Partner Enablement

Python · Linux · Kubernetes
  • B.S./M.S. degree in CS/CE, or equivalent experience, plus 5+ years of relevant experience
  • Experience with parallel programming and communication runtimes
  • Excellent C/C++ programming skills
  • Experience working with an engineering or academic research community supporting HPC or AI
  • Practical experience with high-performance networking
  • Expertise in Linux fundamentals and Python scripting
  • Familiarity with containers, cloud provisioning, and scheduling tools
  • Flexibility to work across different teams and time zones

Benefits For Senior System Software Engineer, NCCL - Partner Enablement

  • Equity
  • Additional benefits available but not specified in detail
