Distinguished Software Architect - Deep Learning and HPC Communications

World leader in accelerated computing, pioneering AI and digital twin technology.
$308,000 - $471,500
Machine Learning
Principal Software Engineer
In-Person
5000+ Employees
15+ years of experience
AI · Enterprise SaaS

Description For Distinguished Software Architect - Deep Learning and HPC Communications

NVIDIA, the pioneer in GPU technology and accelerated computing, is seeking a Distinguished Software Architect to join its GPU Communications Libraries and Networking team. This role focuses on developing communication libraries such as NCCL, NVSHMEM, and UCX for Deep Learning and HPC applications. The position involves working with systems that scale to thousands of GPUs and optimizing GPU-to-GPU communication performance over high-speed interconnects.

The ideal candidate will be an industry-recognized leader in HPC/DL communications with a strong background in parallel programming, system architecture, and deep learning. They will be responsible for researching new communication technologies, co-designing next-generation platforms, and driving innovation in hardware and software solutions.

This is an opportunity to work at the intersection of artificial intelligence and high-performance computing, helping shape the technology that powers everything from scientific discovery to autonomous vehicles. Compensation includes a base salary range of $308,000 - $471,500, plus equity and benefits.

NVIDIA's commitment to innovation and forward-thinking culture makes it one of the most desirable employers in the technology sector. The company values creativity, autonomy, and diversity, fostering an inclusive environment where breakthrough technologies come to life. This role represents a chance to work on cutting-edge technology that directly impacts the advancement of AI and HPC applications worldwide.

Last updated 16 hours ago

Responsibilities For Distinguished Software Architect - Deep Learning and HPC Communications

  • Research new communication technologies and design new features for communication libraries
  • Propose innovative solutions in HW and SW for next-gen platforms
  • Co-design solutions with GPU, Networking, and SW architects
  • Drive adoption of new communication technologies across application verticals
  • Collaborate with DL researchers and customers
  • Keep up with the latest DL research

Requirements For Distinguished Software Architect - Deep Learning and HPC Communications

  • PhD in Computer Science, Computer Engineering, or a related field, or equivalent experience
  • 15+ years of relevant experience in academia or industry
  • Expertise in HPC, parallel programming models, and communication runtimes
  • Deep understanding of high performance networking
  • Strong knowledge of ML/DL fundamentals
  • Programming fluency with C or C++
  • Experience with DL Frameworks (PyTorch, TensorFlow)
  • Ability to work across different HW/SW teams and time zones

Benefits For Distinguished Software Architect - Deep Learning and HPC Communications

  • Equity
  • Benefits package
