Senior Software Developer, HPC Cluster Management

NVIDIA is the world leader in accelerated computing, pioneering AI and digital twin technology to transform industries.
$184,000 - $356,500
Backend
Senior Software Engineer
Remote
5,000+ Employees
7+ years of experience
AI · Enterprise SaaS

Description For Senior Software Developer, HPC Cluster Management

NVIDIA, a global leader in accelerated computing and AI technology, is seeking a Senior Software Developer to join its HPC Cluster Management team. This role focuses on developing and maintaining NVIDIA's Bright Cluster Manager, Linux-based cluster management software that powers thousands of clusters worldwide. The position combines deep technical expertise in Linux systems, distributed computing, and modern cloud technologies with hands-on development of critical infrastructure components.

The role involves working on cutting-edge hardware integration, including GPUs, DPUs, and high-speed interconnects, while developing scalable solutions for cluster management. You'll be responsible for improving cluster provisioning, edge deployment, and firmware management systems that support clusters ranging from small installations to massive deployments with thousands of nodes.

This is an opportunity for an experienced developer who is passionate about high-performance computing and system-level software development. The position offers a base salary range of $184,000 - $356,500, plus equity and comprehensive benefits. Working at NVIDIA means joining a team that drives innovation in AI, digital twins, and accelerated computing.

The ideal candidate will bring strong Python programming skills, deep Linux expertise, and experience with modern infrastructure tools like Ansible and Kubernetes. This role combines technical leadership with hands-on development, requiring both architectural vision and practical implementation skills.


Responsibilities For Senior Software Developer, HPC Cluster Management

  • Develop head node and compute node installation and provisioning processes
  • Work on edge site deployment functionality
  • Integrate the product with the latest hardware (GPUs, DPUs, accelerators, high-speed interconnects)
  • Work on composable infrastructure management features
  • Develop BIOS and firmware upgrade management features
  • Develop scalability features for clusters
  • Add support for new Linux distributions
  • Improve support for alternative CPU architectures like ARM
  • Work on Ansible collections for cluster installation and management
  • Assist support team with customer requests

Requirements For Senior Software Developer, HPC Cluster Management

Python
Linux
Kubernetes
  • Degree in Computer Science or related field (or equivalent experience)
  • 7+ years of experience in software development
  • Strong familiarity with the Linux operating system and networking concepts
  • Proficiency in Python and object-oriented software design
  • Knowledge of design patterns and concurrent programming techniques
  • Focus on high-quality work and clean code
  • Eagerness to learn and use new technologies

Benefits For Senior Software Developer, HPC Cluster Management

  • Equity
  • Benefits package

