Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Tesla

Tesla is a leading electric vehicle manufacturer revolutionizing sustainable transportation and energy solutions.

San Francisco, CA, USA

$140,000 - $300,000

Site Reliability

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

Automotive

Description For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Tesla's Simulations Team is seeking a Senior Site Reliability Engineer to lead its Simulation Cluster Infrastructure. This role is crucial in managing large-scale software infrastructure that simulates billions of miles of vehicle driving daily. The position offers an opportunity to work on cutting-edge technology in the electric vehicle industry.

The role involves leading major initiatives such as implementing new-generation Processor-in-the-Loop cluster infrastructure, managing Bazel/Buck Remote Execution clusters, and establishing distributed observability at scale. You'll work with Kubernetes and Tesla's internal orchestration systems to build robust and reliable infrastructure.

As a Sr. SRE, you'll join a team of expert engineers dedicated to revolutionizing electric vehicle production through advanced software development. The position requires strong experience with Linux internals and Kubernetes, and proficiency in languages like Python, Rust, or Go. You'll be responsible for designing and implementing observability infrastructure, automating deployment processes, and ensuring scalable cloud-first architecture.

The compensation package is highly competitive, ranging from $140,000 to $300,000 annually, plus additional cash and stock awards. Tesla offers comprehensive benefits including medical, dental, and vision coverage, 401(k) matching, stock purchase plans, and various family-friendly benefits. This is an excellent opportunity for an experienced SRE to make a significant impact in the automotive industry while working with cutting-edge technology and infrastructure.

Last updated 2 months ago

Responsibilities For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Lead the bring-up of major software infrastructure initiatives for Software-in-the-Loop simulations at scale
Lead the bring-up of Bazel Remote Execution and Content Addressable Storage platforms
Design and implement observability infrastructure across multiple distributed systems
Influence architectural decisions for scalable, cloud-first service architecture
Automate software release roll-out processes to Kubernetes

Requirements For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

Python

Rust

Kubernetes

Linux

3+ years of relevant technical experience
Experience with Remote Execution in build systems using Bazel
Experience with Linux internals focusing on performance profiling
Advanced experience with Kubernetes
Proficiency in Python, Rust, Go, and/or C++
Experience in data-driven capacity planning
Troubleshooting and full-cycle incident response capabilities

Benefits For Sr. Site Reliability Engineer, Simulation Cluster Infrastructure

401k

Medical Insurance

Dental Insurance

Vision Insurance

Mental Health Assistance

Parental Leave

Commuter Benefits

Medical plans with $0 payroll deduction
Family-building, fertility, adoption and surrogacy benefits
Dental and vision plans
HSA with company contribution
Healthcare and Dependent Care FSA
401(k) with employer match
Employee Stock Purchase Plans
Life, AD&D, short-term and long-term disability insurance
Employee Assistance Program
Sick and Vacation time
Back-up childcare
Commuter benefits
Employee discounts
