NVIDIA, the pioneer in accelerated computing and inventor of the GPU, is seeking a Senior DevOps and Automation Engineer for its Fabric Networking - GPU team. This role is crucial in developing and maintaining software that facilitates GPU communication for High-Performance Computing and Deep Learning solutions.
The position involves working with cutting-edge technology, including large GPU clusters interconnected via NVLink and InfiniBand. You'll be responsible for developing automated tools for cluster deployment, implementing modern DevOps practices, and ensuring optimal cluster performance. This role combines hands-on technical expertise with collaborative teamwork across multiple time zones.
The ideal candidate will bring strong expertise in automation tools like Ansible and Python, along with deep knowledge of Linux systems and cluster management. Experience with GPU-focused hardware and software, particularly DGX systems and compute clusters, would be highly valuable. The role offers exposure to groundbreaking developments in Artificial Intelligence and High-Performance Computing.
NVIDIA offers a competitive compensation package, including a base salary range of $148,000 - $287,500 USD, equity, and comprehensive benefits. This is an opportunity to join a company at the forefront of AI and accelerated computing, working on technology that powers everything from artificial intelligence to autonomous vehicles. The position offers flexibility with remote work options while being part of a team that's driving innovation in the industry.