NVIDIA, a leader in High-Performance Computing, Artificial Intelligence, and Visualization, is seeking an HPC Operations Manager for its Hardware Engineering team. The role involves leading a multi-national team of sysadmins and DevOps engineers, ensuring high reliability of HPC clusters, and collaborating with partners to develop programs for storage, networking, and compute in data centers. Key responsibilities include evaluating technologies, planning hardware deployments, managing HPC schedulers, tracking software licensing, and communicating with senior management. The ideal candidate will have extensive experience in IT infrastructure management, Linux servers, HPC schedulers, and hardware design workflows. This position offers the opportunity to work on cutting-edge technology and contribute to the development of next-generation GPUs and SoCs.
Responsibilities:
- Lead and mentor a multi-national team of sysadmins and DevOps engineers
- Ensure high reliability of HPC clusters and develop critical metrics
- Evaluate the latest technologies and recommend how the infrastructure should evolve
- Manage HPC scheduler (LSF) and drive high utilization
- Collaborate with hardware engineering leaders to support chip design needs
- Develop and manage program schedules, milestones, and deliverables
- Communicate program status to senior management
Requirements:
- B.S. or M.S. in Computer Science, Computer Engineering, or Information Science
- 15+ years of overall experience
- 5+ years managing IT infrastructure teams of 10+ people
- 10+ years of experience with Linux servers, NFS storage, and Ethernet networks
- Knowledge of HPC schedulers (IBM LSF preferred)
- Experience with hardware design workflows (EDA tools and methodology)
- Project management and capacity planning skills
Preferred Skills:
- Experience with HPC storage systems
- InfiniBand expertise
- Software development in a DevOps context
- Knowledge of databases and analytics platforms
- Experience with FlexLM-based software license servers
- Established relationships with enterprise-level equipment suppliers
NVIDIA offers a competitive salary, equity, and comprehensive benefits. The company is committed to fostering a diverse work environment and is an equal opportunity employer.