Site Reliability Engineer (Information Technology)

SpaceX is actively developing technologies to enable human life on Mars, making humanity a multi-planetary species.
$120,000 - $170,000
Site Reliability
Staff Software Engineer
In-Person
3+ years of experience
Space

Description For Site Reliability Engineer (Information Technology)

SpaceX is seeking a Site Reliability Engineer to join its Information Technology Linux Infrastructure team. The role supports SpaceX's mission to make life multiplanetary by developing and maintaining large-scale distributed systems. The ideal candidate will design, maintain, and optimize Kubernetes clusters at scale and help implement SRE practices across teams, drawing on strong skills in Kubernetes, Linux, and container technologies. Compensation ranges from $120,000 to $170,000 annually, with comprehensive benefits that include equity. Candidates must be eligible for a security clearance and willing to join on-call rotations and occasionally work extended hours, since the team maintains critical infrastructure around the clock.

Last updated a month ago

Responsibilities For Site Reliability Engineer (Information Technology)

  • Actively participate in Day-0 to Day-2 operations supporting production Kubernetes installations
  • Collaborate with stakeholders and other Site Reliability Engineers to define requirements for Kubernetes deployments and automation
  • Collaborate with leadership and stakeholders to ensure that our Kubernetes clusters are reliable, scalable and secure
  • Collaborate with peers to develop and maintain automation and PaaS solutions that produce Kubernetes cluster deployments

Requirements For Site Reliability Engineer (Information Technology)

  • Bachelor's degree in computer science, information systems, or an engineering discipline, or 3+ years of professional experience in site reliability or DevOps
  • Experience with Linux operating systems
  • 3+ years writing and deploying software applications in production
  • 3+ years provisioning and administering Kubernetes clusters in production
  • 3+ years architecting and maintaining tooling for automating Kubernetes cluster installation
  • In-depth knowledge of Kubernetes architecture and common plugins
  • Willingness to gain and maintain an active security clearance
  • Must be willing to work extended hours and weekends as needed
  • Must be willing to participate in an off-hours on-call rotation

Benefits For Site Reliability Engineer (Information Technology)

  • Long-term incentives (company stock, stock options, long-term cash awards)
  • Employee Stock Purchase Plan
  • Comprehensive medical, vision, and dental coverage
  • 401(k) retirement plan
  • Short- and long-term disability insurance
  • Life insurance
  • Paid parental leave
  • 3 weeks of paid vacation
  • 10+ paid holidays per year
