Site Reliability Engineer (Information Technology)

SpaceX, founded under the belief that exploring the stars is fundamentally more exciting than not, is actively developing technologies to enable human life on Mars.
Hawthorne, CA, USA
$120,000 - $170,000
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Space

Description For Site Reliability Engineer (Information Technology)

SpaceX is seeking an experienced Site Reliability Engineer to join its Information Technology Linux Infrastructure team. The role supports SpaceX's mission to make life multiplanetary through the development and maintenance of large-scale, distributed, fault-tolerant systems.

The position focuses on Kubernetes design, maintenance, scaling, and optimization in support of critical business functions. The ideal candidate will thrive in a fast-paced environment while managing a fleet of Kubernetes clusters at scale. They will be responsible for infusing SRE culture and practices across teams while tackling complex scaling challenges on the journey to Mars.

The role offers a comprehensive benefits package including medical, dental, and vision coverage, a 401(k) retirement plan, equity through stock options and an Employee Stock Purchase Plan (ESPP), and various other perks. Compensation ranges from $120,000 to $170,000 based on experience level, with additional long-term incentives available.

Key responsibilities include managing production Kubernetes installations, collaborating on deployment automation, and ensuring cluster reliability and security. The position requires strong experience with Linux systems, Kubernetes architecture, and Infrastructure as Code practices. Candidates should be prepared for on-call rotations and occasional extended hours.

This is an exciting opportunity to work at the forefront of space technology while building and maintaining critical infrastructure. The role combines technical expertise in site reliability engineering with SpaceX's ambitious goal of enabling human life on Mars. Successful candidates will join a team of passionate engineers working on cutting-edge technology in the space industry.

The position requires ITAR compliance, meaning candidates must be U.S. citizens, permanent residents, or eligible for required authorizations. SpaceX offers a dynamic work environment where engineers can directly impact the company's mission of making humanity multiplanetary.

Responsibilities For Site Reliability Engineer (Information Technology)

  • Actively participate in Day-0 to Day-2 operations supporting production Kubernetes installations
  • Collaborate with stakeholders and other Site Reliability Engineers to define requirements for Kubernetes deployments and automation
  • Collaborate with leadership and stakeholders to ensure that our Kubernetes clusters are reliable, scalable and secure
  • Collaborate with peers to develop and maintain automation and PaaS solutions that produce Kubernetes cluster deployments

Requirements For Site Reliability Engineer (Information Technology)

  • Bachelor's degree in computer science, information systems, or an engineering discipline; OR 3+ years of professional experience with site reliability or DevOps in lieu of a degree
  • Experience with Linux operating systems
  • 3+ years writing and deploying software applications in production
  • 3+ years provisioning and administering Kubernetes clusters in production
  • 3+ years architecting and maintaining tooling for automating Kubernetes cluster installation
  • 3+ years using Infrastructure as Code and GitOps patterns for managing infrastructure
  • In-depth knowledge of Kubernetes architecture and common plugins
  • Willingness to gain and maintain an active security clearance
  • Must be willing to work extended hours and weekends as needed
  • Must be willing to participate in an off-hours on-call rotation

Benefits For Site Reliability Engineer (Information Technology)

  • 3 weeks of paid vacation
  • 10+ paid holidays per year
  • Comprehensive medical, vision, and dental coverage
  • 401(k) retirement plan
  • Short and long-term disability insurance
  • Life insurance
  • Paid parental leave
  • Stock options
  • Employee Stock Purchase Plan
  • Long-term incentives
