HPC Engineer, AI Infrastructure

Tesla is an automotive and technology company developing electric vehicles, AI systems, and robotics solutions.
$133,440 - $355,920
Cloud
Senior Software Engineer
In-Person
5,000+ Employees
3+ years of experience
AI · Automotive · Robotics

Description For HPC Engineer, AI Infrastructure

Tesla's Supercomputing/AI infrastructure team is at the forefront of developing and maintaining high-performance computing systems that power crucial initiatives like Full Self-Driving (FSD), Tesla Bot, and the Dojo supercomputer. As an HPC Engineer, you'll be instrumental in managing and optimizing the AI infrastructure that enables neural network training at scale. The role combines expertise in Linux systems, GPU computing, and infrastructure automation to support Tesla's ambitious AI and robotics projects.

The position offers a unique opportunity to work with cutting-edge technology in autonomous driving and robotics, while managing hundreds of servers and GPU clusters. You'll be responsible for maintaining and improving the platform that enables Tesla's engineering teams to push the boundaries of AI and machine learning. The role requires both technical depth in HPC systems and the ability to collaborate across teams to ensure smooth operations.

Working at Tesla means joining a team that's revolutionizing multiple industries simultaneously. You'll receive comprehensive benefits including competitive salary, equity opportunities, and excellent healthcare coverage. The role offers significant growth potential as Tesla continues to expand its AI infrastructure and computational capabilities. If you're passionate about high-performance computing and want to contribute to transformative technologies in autonomous driving and robotics, this role presents an exceptional opportunity to make a meaningful impact.

Last updated 2 months ago

Responsibilities For HPC Engineer, AI Infrastructure

  • Support AI/ML cluster infrastructure on GPU and Dojo platforms
  • Improve monitoring & self-healing pipelines and security posture
  • Work with hardware and storage vendors to optimize server, storage and network performance
  • Performance tuning & OS provisioning on Linux systems
  • Manage HPC clusters, workloads and applications
  • Automation and systems engineering
  • Participate in 24x7 on-call rotation

Requirements For HPC Engineer, AI Infrastructure

Python · Linux
  • Proficiency with scripting languages such as Python or Bash
  • Proficiency with Linux & network fundamentals
  • Experience with configuration management software and with systems monitoring & alerting is a plus
  • Experience with high-throughput, low-latency networks and GPU-based computing systems preferred
  • Experience with Slurm, LSF and storage management of parallel file systems is a plus
  • Bachelor's Degree in Computer Science, Computer Engineering, Electrical Engineering, Physics or proof of exceptional skills
  • 3+ years of additional equivalent experience or evidence of exceptional ability

Benefits For HPC Engineer, AI Infrastructure

Medical Insurance · Dental Insurance · Vision Insurance · 401k · Parental Leave · Commuter Benefits
  • Medical insurance with $0 payroll deduction options
  • Family-building, fertility, adoption and surrogacy benefits
  • Dental and vision plans with $0 paycheck contribution options
  • Company Paid HSA Contribution
  • Healthcare and Dependent Care FSA
  • 401(k) with employer match
  • Employee Stock Purchase Plans
  • Company paid Basic Life, AD&D, disability insurance
  • Employee Assistance Program
  • Sick and Vacation time
  • Back-up childcare and parenting support
  • Commuter benefits
  • Employee discounts and perks program
