Senior System Software Engineer, Distributed Systems - DGX Cloud

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
$148,000 - $287,500
Distributed Systems
Senior Software Engineer
Hybrid
6+ years of experience
AI · Enterprise SaaS · Cloud

Description For Senior System Software Engineer, Distributed Systems - DGX Cloud

NVIDIA, the world leader in accelerated computing, is seeking a Senior System Software Engineer to join its DGX Cloud team. This role presents an exciting opportunity to work at the forefront of AI infrastructure development, focusing on distributed systems and cloud solutions.

The position involves designing and developing sophisticated platforms that manage GPU assets across cloud providers, requiring deep expertise in distributed systems and firmware development. You'll be working with cutting-edge technology, collaborating with cross-functional teams to build solutions that power AI applications at scale.

As a Senior Engineer, you'll be responsible for creating robust, scalable solutions for datacenter firmware, ensuring high reliability and availability across cloud platforms. The role requires strong technical skills in Python and Go, combined with comprehensive knowledge of system architecture and cloud infrastructure.

The ideal candidate brings 6+ years of experience in Python development on Linux systems, along with expert-level understanding of distributed systems concepts. You'll need strong communication skills to work effectively across organizational boundaries and geographies.

NVIDIA offers a competitive compensation package with a base salary range of $148,000 to $287,500, plus equity benefits. The company is known for its innovative culture and commitment to pushing technological boundaries in AI and accelerated computing.

This is an exceptional opportunity for someone passionate about distributed systems and cloud infrastructure to join a company that's driving the future of AI technology. You'll be working on challenges that directly impact the advancement of AI infrastructure while being part of a team that values creativity, autonomy, and technical excellence.

The role offers significant growth potential and the chance to work with some of the industry's brightest minds. If you're excited about building the future of AI infrastructure and want to be part of a company that's revolutionizing computing, this position at NVIDIA could be your next career milestone.

Last updated 2 months ago

Responsibilities For Senior System Software Engineer, Distributed Systems - DGX Cloud

  • Design and architect a platform that automates GPU asset provisioning across cloud providers
  • Develop and optimize solutions for datacenter firmware throughout its lifecycle
  • Work with hardware, software, and infrastructure teams
  • Define server-level reliability and availability requirements
  • Drive failure analysis and large-scale solution deployment
  • Ensure software integration from hardware to AI training applications

Requirements For Senior System Software Engineer, Distributed Systems - DGX Cloud

Python
Go
Linux
  • BS, MS, or PhD in EE/CS or related field
  • 6+ years of experience with Python development on Linux
  • Strong communication skills and ability to work with cross-functional teams
  • Knowledge of industry standards (SPI, I2C, PCIe, UEFI, PLDM)
  • Expert-level knowledge of systems programming (Go, Python)
  • Understanding of distributed systems, data synchronization, and fault tolerance
  • Experience with system firmware/software and platform management

Benefits For Senior System Software Engineer, Distributed Systems - DGX Cloud

  • Equity
