AI Infrastructure Engineer

xAI creates AI systems to understand the universe and aid humanity in its pursuit of knowledge.
$180,000 - $440,000
Machine Learning
Senior Software Engineer
In-Person
11 - 50 Employees
5+ years of experience
AI

Description For AI Infrastructure Engineer

xAI is on a mission to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. As an organization, we maintain a flat structure where all employees are hands-on contributors to our mission. Our small, highly motivated team values engineering excellence and intellectual curiosity.

The post-training team plays a crucial role in transforming pre-trained models into steerable, versatile systems capable of addressing real-world challenges. As an AI Infrastructure Engineer, you'll be at the forefront of developing and optimizing frameworks for large-scale machine learning tasks, with a particular focus on reinforcement learning and agent systems.

Your role will involve building high-performance, scalable software that supports cutting-edge AI research. You'll be working on creating efficient training and evaluation frameworks for model fine-tuning, developing large-scale agent simulation systems, and constructing flexible bulk inference frameworks for synthetic data generation.

We're looking for experts in distributed machine learning systems who have deep knowledge of GPUs, Kubernetes, and JAX/PyTorch. You'll be working with technologies including Python, JAX, Rust, CUDA, and NCCL. The role offers an opportunity to push the boundaries of AI capabilities by scaling data and computational resources.

Located in the vibrant Bay Area, you'll be part of a team that values strong communication skills and the ability to share knowledge effectively. The position offers competitive compensation ranging from $180,000 to $440,000 USD annually. Our interview process is thorough but efficient, designed to evaluate both technical expertise and cultural fit through coding assessments, technical sessions, and team presentations.

Join us if you're passionate about advancing AI technology, thrive in a fast-paced environment, and want to be part of a team that's pushing the boundaries of what's possible in artificial intelligence.

Last updated 2 months ago

Responsibilities For AI Infrastructure Engineer

  • Building efficient and user-friendly training and evaluation frameworks for model fine-tuning and reinforcement learning
  • Building efficient and user-friendly software frameworks for large-scale agent simulation and execution
  • Building flexible and performant bulk inference frameworks for synthetic data generation
  • Supporting cutting-edge AI research

Requirements For AI Infrastructure Engineer

  • Python
  • Rust
  • Expert in developing software for large-scale distributed machine learning systems
  • Expert in GPUs, Kubernetes, and JAX (or PyTorch)
  • Experience with standard software engineering best practices (e.g., CI/CD)
  • Strong communication skills
  • Must be located near the Bay Area or open to relocation
