Senior Engineer - DevOps

NVIDIA, the world leader in accelerated computing, pioneered the field to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society.
DevOps
Senior Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS

Description For Senior Engineer - DevOps

NVIDIA is seeking an outstanding engineer to join its Software Infrastructure and Operations team. This fast-paced role involves developing and maintaining sophisticated Kubernetes-based development, build, and test environments for multiple platforms including Windows and Linux.

Key responsibilities include:

  • Designing and architecting scaling operations in data centers
  • Deploying and supporting end-to-end container management solutions with Kubernetes, Docker, and containerd
  • Managing Jenkins instances and developing automation tools
  • Implementing critical metrics tracking and data analytics
  • Prototyping and developing cloud infrastructure

The ideal candidate will have:

  • Strong Kubernetes understanding, especially for on-premises setups
  • Experience maintaining large-scale cloud/on-prem infrastructure applications
  • Proven programming background in Python/Golang/Java
  • Excellent debugging and analytical skills
  • Proficiency with configuration management tools and CI systems
  • Hands-on experience with VMs, Docker, and Kubernetes clusters
  • 8+ years of proven experience
  • Bachelor's or Master's degree in CS, Software Engineering, or related field

NVIDIA offers competitive salaries and generous benefits, making it one of the most desirable employers in the technology world. This role presents an opportunity to work with forward-thinking colleagues on cutting-edge technology in a rapidly growing environment.

Responsibilities For Senior Engineer - DevOps

  • Design and architect scaling operations in data centers
  • Deploy and support end-to-end container management solutions with Kubernetes, Docker, and containerd
  • Design Kubernetes solutions covering service discovery, networking, monitoring, logging, and scheduling
  • Set up and manage end-to-end Jenkins instances
  • Design and develop tools for automating maintenance of 10,000+ hosts
  • Deploy new data center infrastructure
  • Plan and implement critical metrics tracking using data analytics, data mining methods, and dashboards
  • Apply AI techniques to extract useful signals about machines and jobs from generated data
  • Prototype, craft, and develop cloud infrastructure for NVIDIA

Requirements For Senior Engineer - DevOps

Kubernetes · Linux · Python · Go · Java · MySQL · MongoDB

  • Strong Kubernetes understanding and background, especially with on-premises setups
  • Experience maintaining large-scale cloud/on-prem infrastructure applications using Kubernetes
  • Proven programming background in Python/Golang/Java and/or relevant scripting languages
  • Excellent debugging and analytical skills
  • Experience with SQL (MySQL) and NoSQL (Elasticsearch/MongoDB) databases
  • Proficiency with configuration management tools like Ansible, Chef, Puppet
  • Strong experience with Jenkins and/or other CI systems
  • Hands-on experience with VMs, Docker, and Kubernetes clusters
  • Experience with analytics/visualization tools like Kibana, Grafana, Splunk
  • Experience with monitoring systems such as Zabbix and/or Nagios (nice to have)
  • 8+ years of proven experience
  • Bachelor's or Master's Degree or equivalent experience in CS, Software Engineering, or related field
