Observability Software Engineer, AI Infrastructure

Electric vehicle and clean energy company pioneering autonomous driving and AI technologies
$133,440 - $292,800
DevOps
Mid-Level Software Engineer
In-Person
3+ years of experience
AI · Automotive

Description For Observability Software Engineer, AI Infrastructure

Tesla's "Insane Visibility" team is seeking an experienced Observability Software Engineer to join its AI Infrastructure division. The role applies DevOps expertise to AI infrastructure, focusing on maintaining and optimizing the observability systems that support Tesla's autonomous driving and robotics initiatives.

The position offers an exciting opportunity to work at the intersection of autonomous vehicles, artificial intelligence, and infrastructure monitoring. You'll be responsible for designing and implementing comprehensive observability solutions that ensure the reliable operation of Tesla's Full Self-Driving (FSD), Robotaxi, and Tesla Bot applications. This involves creating sophisticated monitoring systems, dashboards, and alerts using industry-standard tools like Grafana, Prometheus, and Splunk.

The ideal candidate will bring 3+ years of software engineering, DevOps, or SRE experience, with a strong foundation in monitoring tools and distributed systems. You'll need expertise in Python scripting and container orchestration with Kubernetes, along with working knowledge of high-performance computing, Slurm, GPU architecture, and networking. Your role will be crucial in maintaining the performance and reliability of Tesla's AI infrastructure stack.

Working at Tesla means joining a company whose mission is to accelerate the world's transition to sustainable energy. The listed pay range is $133,440 to $292,800 annually, plus additional cash and stock awards. Benefits include Aetna PPO and HSA medical plans with $0 payroll deduction, a 401(k) with employer match, and perks such as the Tesla Babies program.

This role presents an exceptional opportunity to make a significant impact on the future of autonomous technology while working with some of the most advanced AI infrastructure systems in the industry. You'll be part of a team that values innovation, problem-solving, and technical excellence, while contributing to Tesla's mission of sustainable transportation and energy.

Last updated 18 days ago

Responsibilities For Observability Software Engineer, AI Infrastructure

  • Design, develop and maintain observability solutions & tools, including monitoring, logging, and alerting systems
  • Create dashboards and automated alerts using tools such as Grafana, Prometheus, Splunk, and Catchpoint
  • Analyze system metrics and logs to identify bottlenecks and optimize application performance
  • Partner with developers, DevOps engineers, and AI Infra teams to integrate observability best practices
  • Assist in troubleshooting and resolving production issues
  • Develop scripts or workflows to automate routine tasks
  • Create and maintain documentation for observability tools, processes, and workflows

Requirements For Observability Software Engineer, AI Infrastructure

Python
Kubernetes
  • 3+ years of experience in software engineering, DevOps, or SRE roles
  • Proficiency in monitoring and visualization tools (Prometheus, Grafana, Splunk, Catchpoint)
  • Strong analytical and troubleshooting skills
  • Working knowledge of high-performance computing (HPC), Slurm, GPU architecture, and networking
  • Working knowledge of logging systems and distributed tracing frameworks
  • Expertise in scripting languages and configuration management tools
  • Experience with containerized environments and cloud platforms
  • Bachelor's Degree in Computer Science, Software Engineering, or related field
  • Excellent verbal and written communication skills

Benefits For Observability Software Engineer, AI Infrastructure

Medical Insurance
Dental Insurance
Vision Insurance
401k
Parental Leave
Mental Health Assistance
Commuter Benefits
  • Aetna PPO and HSA plans with $0 payroll deduction
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental and vision plans with $0 paycheck contribution
  • Company-paid HSA contribution
  • Healthcare and Dependent Care Flexible Spending Accounts
  • 401(k) with employer match
  • Employee Stock Purchase Plans
  • Company-paid basic life, AD&D, short-term and long-term disability insurance
  • Employee Assistance Program
  • Sick and vacation time
  • Back-up childcare and parenting support resources
  • Weight Loss and Tobacco Cessation Programs
  • Tesla Babies program
  • Commuter benefits
  • Employee discounts and perks program
