Senior Manager - Storage Production Engineering

A world leader in accelerated computing, pioneering AI and digital twin technology.
$272,000 - $425,500
Cloud
Staff Software Engineer
In-Person
10+ years of experience
AI · Enterprise SaaS

Description For Senior Manager - Storage Production Engineering

NVIDIA is seeking a Senior Manager for their Storage Production Engineering team to lead Site Reliability Engineering (SRE) initiatives. This role combines technical leadership with people management, focusing on designing and maintaining large-scale production systems with an emphasis on storage solutions.

The position requires a seasoned professional with 10+ years of experience, including 5+ years in management, who can bridge the gap between technical excellence and team leadership. You'll oversee the critical storage infrastructure that supports NVIDIA's internal and external GPU cloud services, working with cloud-native storage solutions and Kubernetes.

As a leader in this role, you'll drive strategic initiatives to enhance storage system reliability and performance while managing a team of Storage SRE professionals. Your responsibilities span from technical architecture decisions to team development, incident response management, and implementing automation solutions for improved efficiency.

The role offers an exciting opportunity to work at the forefront of AI computing, as NVIDIA continues to revolutionize parallel computing and deep learning. You'll be part of a company that invented the GPU and is now leading the AI computing revolution. The position comes with competitive compensation ($272,000 - $425,500) plus equity and benefits.

This is an ideal role for someone who combines deep technical knowledge of storage systems with strong leadership capabilities, and who is passionate about building and maintaining robust, scalable infrastructure. You'll work in a collaborative environment that values innovation and technical excellence, with the opportunity to make a significant impact on systems that power some of the most advanced AI and ML solutions in the world.


Responsibilities For Senior Manager - Storage Production Engineering

  • Lead and mentor a team of Storage SRE professionals
  • Formulate and execute strategic initiatives for storage systems
  • Supervise planning, execution, and enhancement of storage solutions
  • Oversee incident response and resolution for storage-related issues
  • Conduct capacity planning and storage demand forecasting
  • Drive automation initiatives for storage operations
  • Implement continuous improvement processes
  • Collaborate with cross-functional teams for system optimization

Requirements For Senior Manager - Storage Production Engineering

Kubernetes
  • Master's degree in Computer Science, Information Technology, or a related field, or equivalent experience
  • 10+ years of relevant experience overall, including 5+ years of management experience
  • In-depth knowledge of storage technologies, file systems, and cloud-based storage solutions
  • Strong leadership and people management skills
  • Exceptional analytical and problem-solving skills
  • Prior engineering experience with hands-on coding background in storage systems
  • Proficiency in scripting and automation tools

Benefits For Senior Manager - Storage Production Engineering

  • Equity
  • Benefits package available (see nvidia.com/benefits)
