Software Engineer - ML Performance

Baseten

ML infrastructure company backed by top-tier investors, providing production workloads with best-in-class performance for enterprises and AI-native companies.

San Francisco, CA, USA • New York, NY, USA

$150,000 - $250,000

Machine Learning

Mid-Level Software Engineer

Remote

3+ years of experience

AI · Enterprise SaaS

Description For Software Engineer - ML Performance

Baseten is a well-funded ML infrastructure company backed by investors including IVP, Spark Capital, and Greylock. The company is seeking a Software Engineer specializing in ML performance to join its growing team. The role focuses on advancing AI applications, particularly LLM inference optimization.

The position offers an exciting opportunity to work with cutting-edge ML technologies and optimization techniques. You'll be responsible for implementing advanced methods like quantization, speculative decoding, and LoRA for ML model inference. The role requires deep technical expertise in ML libraries, GPU architecture, and programming languages like Python and C++.

This is an ideal opportunity for someone passionate about ML performance optimization and eager to make a significant impact in the AI infrastructure space. The company provides a competitive compensation package including equity, healthcare benefits, and unlimited PTO. You'll be working with various ML startups, offering unique learning and networking opportunities.

The work environment is inclusive and supportive, fostering continuous learning and professional growth. You'll be joining at an exciting time as the company has already achieved product-market fit and secured Series B funding, yet still maintains the dynamic and innovative culture of a startup. The role offers the perfect blend of technical challenges, growth opportunities, and the chance to shape the future of AI infrastructure.

Working at Baseten means being at the forefront of ML infrastructure development, collaborating with a talented team, and contributing to products used by category-defining AI companies. The company's commitment to diversity and inclusion, combined with its strong market position and growth trajectory, makes this an exceptional opportunity for engineers passionate about ML performance optimization.

Last updated 25 days ago

Responsibilities For Software Engineer - ML Performance

Implement, refine, and productionize cutting-edge techniques for ML model inference and infrastructure
Deep dive into underlying codebases to debug ML performance issues
Apply and scale optimization techniques across ML models, particularly large language models
Collaborate with a diverse team to design and implement innovative solutions
Own projects from idea to production

Requirements For Software Engineer - ML Performance

Python
Kubernetes
Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field
Experience with Python or C++
Familiarity with LLM optimization techniques
Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM
Demonstrated interest and experience in LLMs
Deep understanding of GPU architecture

Benefits For Software Engineer - ML Performance

401k
Medical insurance
Equity
Unlimited PTO
Covered healthcare premiums
Competitive compensation package
Inclusive and supportive work culture
Learning and growth opportunities
Exposure to various ML startups

