Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Tesla is a leading electric vehicle manufacturer revolutionizing sustainable transportation and energy solutions.
$140,000 - $300,000
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Automotive

Description For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Tesla's Simulations Team is seeking a Senior Site Reliability Engineer to lead its Simulation Cluster Infrastructure. The role is central to managing the large-scale software infrastructure behind Software-in-the-Loop and Processor-in-the-Loop simulations of Tesla vehicles, which simulate billions of miles of driving every day. The position offers the opportunity to work with cutting-edge technology and lead major initiatives in bringing up new-generation infrastructure.

The ideal candidate will be responsible for implementing distributed observability at scale and leading infrastructure rollout procedures on Kubernetes and Tesla's internal orchestration systems. They will join a team of experienced engineers committed to building robust and reliable systems using modern software development practices.

This role combines technical leadership with hands-on engineering and requires expertise in Kubernetes, Linux systems, and cloud infrastructure. Compensation ranges from $140,000 to $300,000 annually, plus benefits including comprehensive healthcare, a 401(k) with employer match, and an employee stock purchase plan.

Working at Tesla means being part of a mission to accelerate the world's transition to sustainable energy. The role offers the opportunity to solve complex problems in the embedded software space while contributing to electric vehicle production. The position is based in the San Francisco Bay Area and includes medical, dental, and vision insurance along with various family-friendly policies.

Responsibilities For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

  • Lead the bring-up of major software infrastructure initiatives for Software-in-the-Loop simulations at scale
  • Lead the bring-up of Bazel Remote Execution and Content Addressable Storage platforms
  • Design and implement observability infrastructure across multiple distributed systems
  • Influence architectural decisions for scalable, cloud-first service architecture
  • Automate software release roll-out processes to Kubernetes

Requirements For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Python
Go
Kubernetes
Linux
  • 3+ years of relevant technical experience
  • Experience with Remote Execution in build systems using Bazel
  • Experience with Linux internals focusing on performance profiling
  • Advanced experience with Kubernetes
  • Proficiency in Python, Rust, Go, and/or C++
  • Experience in data-driven capacity planning
  • Troubleshooting and full-cycle incident response capabilities

Benefits For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Medical Insurance
Dental Insurance
Vision Insurance
401(k)
Parental Leave
Commuter Benefits
  • Medical insurance with $0 payroll deduction options
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental and vision plans
  • Health Savings Account (HSA) with company contribution
  • Healthcare and Dependent Care FSA
  • 401(k) with employer match
  • Employee Stock Purchase Plans
  • Life, AD&D, short-term and long-term disability insurance
  • Employee Assistance Program
  • Sick and vacation time
  • Back-up childcare
  • Commuter benefits
  • Employee discounts

Jobs Related To Tesla Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Sr. Site Reliability Engineer, Energy

Senior Site Reliability Engineer position at Tesla, focusing on energy IoT applications and infrastructure, offering competitive salary and comprehensive benefits.

Site Reliability Engineer, AI Infrastructure

Senior Site Reliability Engineer position at Tesla, focusing on AI infrastructure maintenance and optimization for autonomous driving and robotics projects.

Sr. Site Reliability Engineer, Dojo

Senior Site Reliability Engineer position at Tesla, focusing on Dojo cluster infrastructure maintenance and optimization with competitive compensation and benefits.

Sr. Site Reliability Engineer, Vehicle Software

Senior SRE position at Tesla leading simulation infrastructure initiatives for vehicle software, offering competitive compensation and comprehensive benefits.
