Senior AI-HPC Storage Engineer

World leader in accelerated computing, pioneering AI and digital twins technology to transform industries.
Santa Clara, CA, USA · Westford, MA 01886, USA · Austin, TX, USA
$184,000 - $356,500
Distributed Systems
Senior Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS

Description For Senior AI-HPC Storage Engineer

NVIDIA, a global leader in accelerated computing and AI technology, is seeking a Senior AI-HPC Storage Engineer to join its GPU AI/HPC Infrastructure team. This role combines cutting-edge storage solutions with AI and high-performance computing, making it an exciting opportunity for experienced engineers passionate about large-scale infrastructure.

The position involves designing and implementing fast storage solutions for demanding deep learning and HPC workloads. You'll be at the forefront of developing next-generation storage architectures, encompassing file, block, and object storage, to support NVIDIA's expanding cloud infrastructure. The role requires expertise in distributed systems, performance optimization, and cloud technologies.

As a Senior AI-HPC Storage Engineer, you'll work with state-of-the-art technology and collaborate with talented researchers and developers. The role offers exposure to NVIDIA's innovative work in AI and digital twins, which is transforming major industries. You'll have the opportunity to influence the direction of storage solutions for one of the world's leading technology companies.

The position offers a competitive compensation package, including a base salary range of $184,000 to $356,500, plus equity and comprehensive benefits. NVIDIA's commitment to diversity and inclusion, combined with its culture of continuous innovation and learning, makes this an ideal opportunity for someone looking to make a significant impact in the field of AI and HPC infrastructure.

The ideal candidate will bring 8+ years of experience in large-scale storage infrastructure, strong expertise in distributed systems, and a deep understanding of AI/HPC workloads. Knowledge of cloud platforms, container technologies, and Linux systems is essential. Experience with NVIDIA GPUs, CUDA programming, and deep learning frameworks would be particularly valuable.

Last updated 5 days ago

Responsibilities For Senior AI-HPC Storage Engineer

  • Research and implement distributed storage services
  • Design and implement on-prem AI/HPC infrastructure with cloud computing support
  • Design scalable and efficient next-gen storage solutions for data-intensive applications
  • Develop tooling for automation of large-scale infrastructure management
  • Document procedures and practices related to distributed file systems
  • Collaborate with teams to understand developers' workflows
  • Guide methodologies for building, testing, and deploying applications
  • Support researchers with performance analysis and optimizations
  • Perform root cause analysis and suggest corrective actions

Requirements For Senior AI-HPC Storage Engineer

Python
Linux
Kubernetes
  • Bachelor's degree in Computer Science, Electrical Engineering or related field
  • 8+ years of experience designing and operating large-scale storage infrastructure
  • Experience analyzing and tuning performance for AI/HPC workloads
  • Experience with parallel or distributed filesystems (Lustre, GPFS)
  • Proficient in CentOS/RHEL and/or Ubuntu Linux distros
  • Python programming and bash scripting skills
  • Experience with cloud storage solutions (AWS, Azure or GCP)
  • Experience with AI/HPC cluster job schedulers (SLURM, LSF)
  • Understanding of container technologies (Docker, Enroot)
  • Experience with AI/HPC workflows using MPI

Benefits For Senior AI-HPC Storage Engineer

Medical Insurance
Equity
  • Competitive salaries
  • Comprehensive benefits package

Jobs Related To NVIDIA Senior AI-HPC Storage Engineer

Senior Software Engineer-Distributed Inference

Senior Software Engineer position at NVIDIA focusing on distributed inference and AI performance optimization, offering competitive compensation and remote work options.

Senior HPC Performance Engineer

Senior HPC Performance Engineer role at NVIDIA focusing on GPU Communications Libraries and Networking, optimizing performance for deep learning and HPC applications.

Senior Generalist Software Engineer -- Omniverse

Senior Generalist Software Engineer position at NVIDIA focusing on Omniverse, computer graphics, and compute systems development in Taiwan.

Senior System Software Engineer, NCCL - Partner Enablement

Senior System Software Engineer position at NVIDIA focusing on NCCL partner enablement, combining distributed systems expertise with customer support for AI and HPC applications.

Senior GPU Cluster Software Engineer

Senior GPU Cluster Software Engineer position at NVIDIA, focusing on building profiling solutions for large-scale ML/DL applications on GPU compute clusters.