NVIDIA is seeking a highly motivated Senior High-Performance System Architect to join its team of experts and help shape the future of high-performance and ML/AI computing. The role involves defining the InfiniBand and NVL system architecture end-to-end, researching solutions for next-generation large-scale high-performance computing clusters, and collaborating with cross-functional teams.
Key responsibilities include:
- Defining InfiniBand and NVL system architecture throughout all product life cycles
- Researching solutions for large-scale high-performance computing clusters
- Collaborating with cross-functional teams to ensure successful project execution
Requirements:
- B.Sc., M.Sc., or Ph.D. in Computer Science, Computer Engineering, or Electrical Engineering
- 5+ years of industry or research experience in computer networks
- Excellent understanding of large-scale network behavior and distributed computing workloads
- Experience in developing simulation environments
- Strong managerial, problem-solving, and critical thinking skills
Preferred qualifications:
- Knowledge of network protocols (InfiniBand, IP, TCP, RoCE) and network topologies
- Proficiency in Python and C++
- Familiarity with HPC environments, routing algorithms, and simulation environments
- Experience with AI workloads and communication libraries
NVIDIA offers the opportunity to work on cutting-edge technology and drive innovation in next-generation networks used by top researchers and engineers worldwide. The company is committed to fostering a diverse work environment and is an equal opportunity employer.