Senior System Software Engineer, NCCL - Partner Enablement

World leader in accelerated computing, pioneering AI and digital twins technology to transform industries.
$148,000 - $287,500
Distributed Systems
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior System Software Engineer, NCCL - Partner Enablement

NVIDIA, the pioneer in GPU technology and accelerated computing, is seeking a Senior System Software Engineer for its GPU Communications Libraries and Networking team. This role focuses on the NCCL and NVSHMEM communication runtimes for Deep Learning and HPC applications. The position offers an opportunity to work with a cutting-edge AI networking stack and large-scale GPU clusters.

The role involves working with partners and customers to optimize NCCL performance, conducting analysis on GPU clusters, and developing automation tools for various cloud platforms. You'll be at the forefront of AI and HPC technology, working with state-of-the-art systems and contributing to NVIDIA's groundbreaking developments in artificial intelligence and high-performance computing.

The ideal candidate brings strong expertise in C/C++ programming, parallel computing, and high-performance networking protocols. Experience with cloud platforms, containerization, and deep learning frameworks is highly valued. This position offers the chance to work with diverse teams globally and impact the future of AI computing infrastructure.

Working at NVIDIA means joining a company that's transforming industries through AI and digital twins technology. The role offers competitive compensation, including equity, and the opportunity to work with leading-edge technology in a flexible, remote-friendly environment. If you're passionate about high-performance computing and want to contribute to advancing AI technology, this role provides an excellent opportunity to make a significant impact.

Last updated 5 days ago

Responsibilities For Senior System Software Engineer, NCCL - Partner Enablement

  • Engage with partners and customers to root-cause functional and performance issues with NCCL
  • Conduct performance characterization and analysis of NCCL and DL applications on GPU clusters
  • Develop tools and automation to isolate issues on new systems and platforms
  • Share HPC knowledge with customers and support teams
  • Write documentation and conduct trainings/webinars for NCCL
  • Engage with internal teams across different time zones

Requirements For Senior System Software Engineer, NCCL - Partner Enablement

Python
Linux
Kubernetes
  • B.S./M.S. degree in CS/CE or equivalent experience with 5+ years of relevant experience
  • Experience with parallel programming and communication runtimes
  • Excellent C/C++ programming skills
  • Experience with high performance networking
  • Expertise in Linux fundamentals and Python
  • Familiarity with containers, cloud provisioning, and scheduling tools
  • Flexibility to work across different teams and time zones

Benefits For Senior System Software Engineer, NCCL - Partner Enablement

  • Equity

