Senior DevOps Engineer - AI Infrastructure

NVIDIA is the world leader in accelerated computing, pioneering solutions in AI and digital twins.
DevOps
Senior Software Engineer
In-Person
4+ years of experience
AI · Automotive

Description For Senior DevOps Engineer - AI Infrastructure

NVIDIA is seeking a Senior DevOps Engineer to join its AI Infrastructure team and help scale its AI infrastructure. The role offers the chance to work at the forefront of AI and autonomous vehicle technology, and it calls for strong programming skills combined with a deep understanding of cloud technologies and automation systems.

The role sits within the AI Infrastructure Software team, which offers a dynamic, startup-like environment with a strong focus on execution and teamwork. You'll be instrumental in creating and scaling out a new product category, with the current focus on autonomous vehicle infrastructure. The position involves building and maintaining critical infrastructure that supports NVIDIA's AI-based applications across autonomous vehicles, healthcare, virtual reality, and visual computing.

Key responsibilities include collaborating with AI product teams, building infrastructure tools for AI systems development, implementing automated build and test solutions, and maintaining production systems. You'll work with cutting-edge technologies including cloud computing, Kubernetes, and Docker, while being part of a team that's shaping the future of AI on NVIDIA GPUs.

The ideal candidate brings 4+ years of experience, a strong technical foundation in automation and cloud infrastructure, and expertise in modern DevOps tools and practices. Experience with cloud platforms, CI/CD pipelines, and programming languages such as Go and Python is essential. Knowledge of observability tools and autonomous vehicle development is highly valued.

Join NVIDIA, one of technology's most desirable employers, and be part of a team that's driving innovation in AI infrastructure. This role offers the opportunity to work with forward-thinking professionals while contributing to groundbreaking advancements in AI and autonomous vehicle technology.

Responsibilities For Senior DevOps Engineer - AI Infrastructure

  • Collaborate with AI product teams to understand data and compute requirements
  • Build infrastructure and tools for AI-based systems development
  • Provide the development team with automated build and test solutions
  • Maintain version control schemas using Git
  • Orchestrate live systems using maintenance windows and HA failover
  • Integrate NVIDIA products into CI workflow
  • Automate tasks and improve functional automated tests
  • Participate in on-call rotation for production systems support

Requirements For Senior DevOps Engineer - AI Infrastructure

Go · Python · Linux · Kubernetes
  • BS/MS with 4+ years of experience
  • Experience with orchestration systems (Kubernetes, Swarm, Mesos, etc.)
  • Experience with microservices and ETL jobs
  • Experience with cloud automation tools (Ansible, Terraform)
  • Understanding of AWS or equivalent cloud services
  • Experience with CI/CD tools (Jenkins, GitHub, GitLab)
  • Programming skills in Go, Python, Bash
  • Linux system administration skills
  • Networking and storage knowledge (Linux firewall, PXE, NFS, ZFS, CIFS)
  • Understanding of observability tools (Prometheus, Grafana, OpenTelemetry)
  • Fluent English
