Senior Software Engineer, Bare Metal Automation - DGX Cloud

World leader in accelerated computing, pioneering AI and digital twins technology.
$148,000 - $276,000
Cloud
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
AI · Enterprise SaaS

Description For Senior Software Engineer, Bare Metal Automation - DGX Cloud

NVIDIA, the world leader in accelerated computing, is seeking a Senior Software Engineer for its Bare Metal Automation team within DGX Cloud. This role combines hardware expertise with software engineering excellence to scale AI infrastructure. The position involves managing large GPU clusters, implementing monitoring systems, and ensuring reliable operation of AI workloads.

The ideal candidate will have extensive experience with bare metal hardware automation, including managing software/firmware versions and working with baseboard management controllers. You'll be part of a team that's pushing the boundaries of AI infrastructure, working with cutting-edge GPU technology and distributed systems.

This role offers an opportunity to work at the intersection of hardware and software, developing solutions that power NVIDIA's AI computing initiatives. You'll be responsible for implementing monitoring and health management capabilities that enable industry-leading reliability, availability, and scalability of GPU assets.

The position comes with competitive compensation, including a base salary range of $148,000 - $276,000, plus equity benefits. NVIDIA offers a collaborative environment where creativity and autonomous thinking are valued, and where you will work alongside some of the technology industry's most forward-thinking professionals.

As part of NVIDIA's DGX Cloud team, you'll be at the forefront of the AI computing era, contributing to systems that enable a broad range of AI-based applications. The role combines technical depth with cross-functional collaboration, making it ideal for those who enjoy both technical challenges and teamwork.

Last updated 8 days ago

Responsibilities For Senior Software Engineer, Bare Metal Automation - DGX Cloud

  • Work on the DGX Cloud team, managing production systems for large, scalable GPU clusters
  • Implement monitoring and health management capabilities for GPU assets
  • Manage a fleet of GPU nodes
  • Work with cross-functional teams to ensure production AI clusters run reliably
  • Evaluate system failures and improve services through incident management

Requirements For Senior Software Engineer, Bare Metal Automation - DGX Cloud

  • 5+ years of experience in a similar role with large-scale production systems
  • Direct software engineering experience with bare metal hardware APIs
  • BS in Computer Science, Engineering, Physics, Mathematics, or equivalent
  • Strong communication skills and ability to work with cross-functional teams
  • Proficiency in systems programming languages (Go, Python)
  • Solid understanding of data structures and algorithms

Benefits For Senior Software Engineer, Bare Metal Automation - DGX Cloud

  • Equity

Jobs Related To NVIDIA Senior Software Engineer, Bare Metal Automation - DGX Cloud

Senior Software Engineer, Kubernetes - DGX Cloud

Senior Software Engineer position at NVIDIA focusing on Kubernetes development for DGX Cloud, working on GPU resource scheduling and cluster management for AI workloads.

Senior DGX Cloud Software Engineer - Infrastructure Automation and Distributed Systems

Senior Cloud Engineer role at NVIDIA focusing on infrastructure automation and distributed systems for DGX Cloud services.

Senior AI-HPC Storage Engineer

Senior AI-HPC Storage Engineer role at NVIDIA, focusing on designing and implementing advanced storage solutions for AI and high-performance computing environments.

Senior Cloud Platform Software Engineer

Senior Cloud Platform Engineer role at NVIDIA building scalable cloud services for AI workloads, requiring 12+ years of experience in platform engineering and expertise in Kubernetes.

Senior Software Engineer, Reliability and Operational Excellence - DGX Cloud

Senior Software Engineer position focused on reliability and operational excellence for NVIDIA's DGX Cloud platform.