Tesla's Supercomputing/AI infrastructure team builds and maintains the infrastructure behind Tesla's machine learning operations, supporting projects such as Autopilot, Tesla Bot, and the Dojo supercomputer. As a Site Reliability Engineer, you'll manage and optimize the AI infrastructure that powers Tesla's autonomous driving and robotics initiatives.
The role combines high-performance computing expertise with site reliability engineering and requires skills in Python, Golang, and Linux systems. You'll work with GPU clusters and the Dojo platform while ensuring the reliability and efficiency of the systems that enable neural network training at scale.
This position offers an opportunity to shape the future of autonomous driving and robotics technology. You'll join a team pushing the boundaries of AI infrastructure, working on projects that contribute directly to Tesla's autonomy efforts and to its mission of accelerating the world's transition to sustainable energy.
The compensation package is highly competitive, ranging from $120,000 to $300,000 annually, plus additional cash and stock awards. Tesla offers comprehensive benefits including medical, dental, and vision coverage, 401(k) matching, and various family-friendly benefits. The role is based in the San Francisco Bay Area, putting you at the heart of Tesla's innovation hub.
This position suits a seasoned engineer who wants to work on challenging technical problems at scale, with direct impact on Tesla's autonomous driving and robotics projects. It requires both technical expertise and operational excellence and offers significant growth opportunities in the rapidly evolving field of AI infrastructure.