NVIDIA, the pioneer in GPU technology and AI innovation, is seeking a Senior DevOps Engineer to lead its GPU cluster infrastructure. This role sits at the intersection of high-performance computing and artificial intelligence, where you'll be responsible for designing and managing large-scale GPU clusters that power cutting-edge AI workloads.
The position offers an opportunity to work with state-of-the-art technology in a company that's driving the future of AI and machine learning. You'll be managing infrastructure that supports multiple teams and projects, making a direct impact on NVIDIA's AI initiatives. The role requires expertise in cloud technologies, infrastructure automation, and high-performance computing environments.
As a Senior DevOps Engineer, you'll be responsible for ensuring the reliability and efficiency of GPU clusters, implementing best practices in infrastructure as code, and maintaining high availability for critical systems. You'll work across a multi-cloud environment (AWS, GCP, Azure, and OCI) as well as on-premises infrastructure.
The ideal candidate has a strong background in software engineering, with specific experience in GPU cluster management or similar high-performance computing environments. You'll need proficiency in container orchestration and infrastructure automation, along with excellent problem-solving skills. The role offers competitive compensation between $180,000 and $339,250, plus equity benefits.
This is an excellent opportunity for someone passionate about infrastructure automation and operational excellence, who wants to work at the forefront of AI technology. You'll be joining a diverse and experienced team, contributing to groundbreaking developments in artificial intelligence and high-performance computing at NVIDIA.