Site Reliability Engineer, Observability, Infrastructure

Electric vehicle and clean energy company pioneering sustainable transportation and energy solutions.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Automotive

Description For Site Reliability Engineer, Observability, Infrastructure

Tesla is seeking a Site Reliability Engineer to join its observability team, which ensures the smooth operation and visibility of Tesla's global applications, manufacturing, fleet, and Autopilot platforms. The role is central to managing a large-scale Splunk infrastructure that processes 300TB+ of data daily. It involves designing and optimizing next-generation logging and analytics platforms and requires expertise in distributed systems, monitoring, and observability tools. The ideal candidate will work with cross-functional teams, handle complex system architectures, and participate in on-call rotations. Tesla offers comprehensive benefits, including medical, dental, and vision coverage, 401(k) matching, and stock purchase options. This is an opportunity to contribute to Tesla's mission of accelerating the world's transition to sustainable energy while working with cutting-edge technology and infrastructure at scale.

Last updated a month ago

Responsibilities For Site Reliability Engineer, Observability, Infrastructure

  • Administer large-scale, multi-site distributed Splunk cluster environments processing 300TB+ of data daily
  • Optimize infrastructure performance and streamline operations
  • Design and implement next-generation logging infrastructure
  • Build and maintain critical systems for observability platforms
  • Collaborate with cross-functional teams for comprehensive service visibility
  • Troubleshoot performance and access issues
  • Create and maintain technical documentation
  • Configure and manage CI/CD pipelines
  • Participate in 24/7 on-call rotation

Requirements For Site Reliability Engineer, Observability, Infrastructure

Python
Linux
  • Experience in configuring and maintaining Linux systems at scale
  • Expert knowledge of distributed Splunk installations
  • Experience with Prometheus, Grafana, and Node Exporter
  • Experience with Cribl Stream and Cribl Edge configuration
  • Strong knowledge of SPL or SQL query languages
  • Strong knowledge of data administration and pipeline creation
  • Proficient in CI/CD pipelines using Ansible and GitHub Actions
  • Expert skills in Python and shell scripting
  • Experience in logging and analytics platform migration

Benefits For Site Reliability Engineer, Observability, Infrastructure

Medical Insurance
Dental Insurance
Vision Insurance
401(k)
Mental Health Assistance
Parental Leave
Commuter Benefits
  • Medical plans with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental and vision plans
  • Company-paid HSA contribution
  • Healthcare and Dependent Care FSA
  • 401(k) with employer match
  • Employee Stock Purchase Plans
  • Company-paid basic life, AD&D, and disability insurance
  • Employee Assistance Program
  • Sick and vacation time
  • Back-up childcare
  • Commuter benefits
  • Employee discounts
