NVIDIA, the world leader in accelerated computing, is seeking a Senior Software Engineer for its Bare Metal Automation team within DGX Cloud. This role is crucial for scaling up NVIDIA's AI infrastructure, focusing on managing and automating large-scale GPU clusters. The position combines hardware expertise with software engineering and requires experience with bare metal hardware APIs and frameworks, particularly for GPU servers.
The role involves working with cutting-edge AI infrastructure, managing fleets of GPU nodes, and implementing sophisticated monitoring and health management systems. You'll be part of a team responsible for maintaining industry-leading reliability and performance of GPU clusters, working directly with NVIDIA's advanced computing technologies.
The ideal candidate brings 5+ years of experience with large-scale production systems, strong programming skills in languages like Go and Python, and a deep understanding of bare metal hardware automation. This position offers an opportunity to work at the forefront of AI computing, contributing to systems that power AI workloads across industries.
NVIDIA offers a competitive compensation package, including a base salary range of $148,000-$276,000, equity, and comprehensive benefits. The company is known for its innovative culture and is consistently ranked as one of the technology world's most desirable employers. This role provides an excellent opportunity for those passionate about GPU hardware and AI infrastructure to make a significant impact in the field.