NVIDIA is seeking a Principal Architect to work on a scalable hybrid cloud system that provides infrastructure services across multiple teams. The role involves crafting cloud solutions that handle millions of jobs and thousands of systems, in collaboration with NVIDIA groups such as Graphics Processors, Mobile Processors, Deep Learning, AI, and Autonomous Vehicles. The cloud services will run on thousands of servers, supporting a heterogeneous mix of machines with various operating systems and hardware platforms.
Key Responsibilities:
- Design creative, scalable cloud solutions for millions of jobs and thousands of systems
- Tackle challenging infrastructure problems in areas like NIMs, Kubernetes, job scheduling, and resource management
- Develop observability solutions to improve system availability and reliability and to reduce latency
- Collaborate with customers to understand their needs and create innovative solutions
Requirements:
- Experience in architecting scalable cloud infrastructure solutions
- Expertise in Kubernetes
- Strong object-oriented programming skills (Java or Go preferred)
- Ability to collaborate across multiple teams and time zones
- Bachelor's degree or equivalent experience
- Strong software/hardware engineering background
- 12+ years of experience in infrastructure
Preferred Qualifications:
- Experience in design, implementation, and deployment of major infrastructure features
- Knowledge of AI/ML and Data Analytics applied to Infrastructure
- Experience with large-scale, multi-cluster Kubernetes environments
- Ability to design robust distributed systems for heterogeneous platforms
NVIDIA offers a competitive base salary range of $272,000 - $419,750 USD, along with equity and comprehensive benefits. Join NVIDIA to work with the most talented people in the world and advance Artificial Intelligence through innovative infrastructure solutions.