Senior Engineer - DevOps

NVIDIA, the world leader in accelerated computing, pioneered the field to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society.
DevOps
Senior Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS

Description For Senior Engineer - DevOps

NVIDIA is seeking an outstanding engineer to join its Software Infrastructure and Operations team. This fast-paced role involves developing and maintaining sophisticated Kubernetes-based development, build, and test environments for multiple platforms including Windows and Linux.

Key responsibilities include:

  • Designing and architecting scaling operations in data centers
  • Deploying and supporting end-to-end container management solutions with Kubernetes, Docker, and containerd
  • Managing Jenkins instances and developing automation tools
  • Implementing critical metrics tracking and data analytics
  • Prototyping and developing cloud infrastructure

The ideal candidate will have:

  • Strong Kubernetes understanding, especially for on-premises setups
  • Experience maintaining large-scale cloud/on-prem infrastructure applications
  • Proven programming background in Python/Golang/Java
  • Excellent debugging and analytical skills
  • Proficiency with configuration management tools and CI systems
  • Hands-on experience with VMs, Docker, and Kubernetes clusters
  • 8+ years of proven experience
  • Bachelor's or Master's degree in CS, Software Engineering, or related field

NVIDIA offers competitive salaries and generous benefits, making it one of the most desirable employers in the technology world. This role presents an opportunity to work with forward-thinking colleagues on cutting-edge technology in a rapidly growing environment.


Responsibilities For Senior Engineer - DevOps

  • Design and architect scaling operations in data centers
  • Deploy and support end-to-end container management solutions with Kubernetes, Docker, and containerd
  • Design Kubernetes solutions covering service discovery, networking, monitoring, logging, and scheduling
  • Set up and manage end-to-end Jenkins instances
  • Design and develop tools for automating maintenance of 10,000+ hosts
  • Deploy new data center infrastructure
  • Plan and implement critical metrics tracking using data analytics, data mining methods, and dashboards
  • Apply AI techniques to extract useful signals about machines and jobs from generated data
  • Prototype, craft, and develop cloud infrastructure for NVIDIA

Requirements For Senior Engineer - DevOps

Kubernetes · Linux · Python · Go · Java · MySQL · MongoDB

  • Strong Kubernetes understanding and background, especially for on-premises setups
  • Experience maintaining large-scale cloud/on-prem infrastructure applications using Kubernetes
  • Proven programming background in Python/Golang/Java and/or relevant scripting languages
  • Excellent debugging and analytical skills
  • Experience with SQL (MySQL) and NoSQL (Elasticsearch/MongoDB) databases
  • Proficiency with configuration management tools like Ansible, Chef, Puppet
  • Strong experience with Jenkins and/or other CI systems
  • Hands-on experience with VMs, Docker, and Kubernetes clusters
  • Experience with analytics/visualization tools like Kibana, Grafana, Splunk
  • Experience with monitoring systems such as Zabbix and/or Nagios (nice to have)
  • 8+ years of proven experience
  • Bachelor's or Master's Degree or equivalent experience in CS, Software Engineering, or related field
