HPC Engineer, AI Infrastructure

Tesla

Tesla is an automotive and technology company developing electric vehicles, AI systems, and robotics solutions.

San Francisco, CA, USA

$133,440 - $355,920

Cloud

Senior Software Engineer

In-Person

5,000+ Employees

3+ years of experience

AI · Automotive · Robotics

Description For HPC Engineer, AI Infrastructure

Tesla's Supercomputing/AI infrastructure team develops and maintains the high-performance computing systems that power initiatives such as Full Self-Driving (FSD), Tesla Bot, and the Dojo supercomputer. As an HPC Engineer, you'll manage and optimize the AI infrastructure that enables neural network training at scale. The role combines expertise in Linux systems, GPU computing, and infrastructure automation to support Tesla's AI and robotics projects.

The position offers a unique opportunity to work with cutting-edge technology in autonomous driving and robotics, while managing hundreds of servers and GPU clusters. You'll be responsible for maintaining and improving the platform that enables Tesla's engineering teams to push the boundaries of AI and machine learning. The role requires both technical depth in HPC systems and the ability to collaborate across teams to ensure smooth operations.

Working at Tesla means joining a team that's revolutionizing multiple industries simultaneously. You'll receive comprehensive benefits including competitive salary, equity opportunities, and excellent healthcare coverage. The role offers significant growth potential as Tesla continues to expand its AI infrastructure and computational capabilities. If you're passionate about high-performance computing and want to contribute to transformative technologies in autonomous driving and robotics, this role presents an exceptional opportunity to make a meaningful impact.

Last updated 2 months ago

Responsibilities For HPC Engineer, AI Infrastructure

Support AI/ML cluster infrastructure on GPU and Dojo platforms
Improve monitoring & self-healing pipelines and security posture
Work with hardware and storage vendors to optimize server, storage and network performance
Performance tuning & OS provisioning on Linux systems
Manage HPC clusters, workloads and applications
Automation and systems engineering
Participate in 24x7 on-call rotation

Requirements For HPC Engineer, AI Infrastructure

Proficiency with scripting languages such as Python or Bash
Proficiency with Linux & network fundamentals
Experience with configuration management software, systems monitoring & alerting is a plus
Experience with high-throughput, low-latency networks and GPU-based computing systems preferred
Experience with Slurm, LSF and storage management of parallel file systems is a plus
Bachelor's Degree in Computer Science, Computer Engineering, Electrical Engineering, or Physics, or proof of exceptional skills
3+ years of additional equivalent experience or evidence of exceptional ability

Benefits For HPC Engineer, AI Infrastructure

Medical Insurance

Dental Insurance

Vision Insurance

401k

Parental Leave

Commuter Benefits

Medical insurance with $0 payroll deduction options
Family-building, fertility, adoption and surrogacy benefits
Dental and vision plans with $0 paycheck contribution options
Company Paid HSA Contribution
Healthcare and Dependent Care FSA
401(k) with employer match
Employee Stock Purchase Plans
Company paid Basic Life, AD&D, disability insurance
Employee Assistance Program
Sick and Vacation time
Back-up childcare and parenting support
Commuter benefits
Employee discounts and perks program

