Senior Compute Site Reliability Engineer (GPU)

Global technology company that designs, develops, and sells consumer electronics, software, and services.
$135,400 - $250,600
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS

Description For Senior Compute Site Reliability Engineer (GPU)

Apple is seeking a Senior Compute Site Reliability Engineer to join its Services Engineering team, focusing on GPU infrastructure and cloud services. This role combines traditional SRE responsibilities with specialized GPU-accelerated computing expertise.

The position offers an exciting opportunity to work at one of the world's leading technology companies, where you'll be responsible for maintaining and enhancing SRE practices for a private cloud service. You'll be working with cutting-edge technology, including GPU-accelerated VM infrastructure, Kubernetes clusters, and modern monitoring tools.

The ideal candidate will bring 5+ years of SRE/DevOps experience, along with deep knowledge of GPU infrastructure and cloud platforms. You'll be working in a collaborative environment, interfacing with data scientists, developers, and other stakeholders to build and maintain robust, scalable systems.

Key aspects of the role include designing GPU-accelerated infrastructure, implementing Kubernetes clusters, ensuring security and scalability, and optimizing resource utilization. You'll be involved in critical activities such as capacity planning, scale testing, and disaster recovery exercises.

Compensation ranges from $135,400 to $250,600 and is complemented by benefits including medical and dental coverage, employee stock programs, and education reimbursement. The role suits someone passionate about infrastructure and GPU computing.

The position is based in Seattle, offering the chance to work in one of the major tech hubs while contributing to Apple's cloud infrastructure. You'll be part of a team that values ownership, clear communication, and continuous learning, making it an excellent opportunity for career growth and technical development.

This role is perfect for someone who combines technical expertise with strong communication skills and a passion for building reliable, scalable systems. You'll have the opportunity to work with modern technologies while solving complex challenges in GPU computing and infrastructure management.


Responsibilities For Senior Compute Site Reliability Engineer (GPU)

  • Design and deploy GPU-accelerated VM and container infrastructure using platforms such as KVM, Qemu, AWS, or Google Cloud
  • Implement GPU-based Kubernetes clusters to support containerized applications and services
  • Work with data scientists, developers, and other stakeholders to understand requirements
  • Implement best practices for security, scalability, and high availability
  • Monitor and optimize resource utilization
  • Participate in capacity planning, scale testing, and disaster recovery exercises
  • Troubleshoot issues across the entire infrastructure stack
  • Maintain relationships with internal teams and external third-party vendors

Requirements For Senior Compute Site Reliability Engineer (GPU)

  • Skills: Kubernetes, Go, Linux
  • 5+ years in a Site Reliability Engineering, DevOps, or infrastructure-focused role
  • Experience with GPU-based virtual machine infrastructure and cloud platforms
  • Experience with GPU hardware and associated software stack
  • Experience with GitOps, CI/CD tools, and deployment strategies
  • Ability to implement telemetry using monitoring tools
  • Outstanding organizational and communication skills
  • BS/MS degree in Engineering or Computer Science or equivalent experience

Benefits For Senior Compute Site Reliability Engineer (GPU)

  • Comprehensive medical and dental coverage
  • Retirement benefits
  • Employee stock programs
  • Education reimbursement
  • Discretionary restricted stock unit awards
  • Employee Stock Purchase Plan
  • Discretionary bonuses
  • Relocation assistance
  • Product discounts
  • Free services

