Software Engineer - ML Performance

ML infrastructure company backed by top-tier investors, delivering best-in-class performance for the production workloads of enterprises and AI-native companies.
$150,000 - $250,000
Machine Learning
Mid-Level Software Engineer
Remote
3+ years of experience
AI · Enterprise SaaS

Description For Software Engineer - ML Performance

Baseten is a well-funded ML infrastructure company backed by investors including IVP, Spark Capital, and Greylock. It is seeking a Software Engineer specializing in ML performance to join its growing team. The role focuses on advancing AI applications, particularly LLM inference optimization.

The position offers an exciting opportunity to work with cutting-edge ML technologies and optimization techniques. You'll be responsible for implementing advanced methods like quantization, speculative decoding, and LoRA for ML model inference. The role requires deep technical expertise in ML libraries, GPU architecture, and programming languages like Python and C++.

This is an ideal opportunity for someone passionate about ML performance optimization and eager to make a significant impact in the AI infrastructure space. The company provides a competitive compensation package including equity, healthcare benefits, and unlimited PTO. You'll be working with various ML startups, offering unique learning and networking opportunities.

The work environment is inclusive and supportive, fostering continuous learning and professional growth. You'll be joining at an exciting time as the company has already achieved product-market fit and secured Series B funding, yet still maintains the dynamic and innovative culture of a startup. The role offers the perfect blend of technical challenges, growth opportunities, and the chance to shape the future of AI infrastructure.

Working at Baseten means being at the forefront of ML infrastructure development, collaborating with a talented team, and contributing to products used by category-defining AI companies. The company's commitment to diversity and inclusion, combined with its strong market position and growth trajectory, makes this an exceptional opportunity for engineers passionate about ML performance optimization.

Last updated 25 days ago

Responsibilities For Software Engineer - ML Performance

  • Implement, refine, and productionize cutting-edge techniques for ML model inference and infrastructure
  • Deep dive into underlying codebases to debug ML performance issues
  • Apply and scale optimization techniques across ML models, particularly large language models
  • Collaborate with a diverse team to design and implement innovative solutions
  • Own projects from idea to production

Requirements For Software Engineer - ML Performance

Python
Kubernetes
  • Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or related field
  • Experience with Python or C++
  • Familiarity with LLM optimization techniques
  • Strong familiarity with ML libraries, especially PyTorch, TensorRT, or TensorRT-LLM
  • Demonstrated interest in and experience with LLMs
  • Deep understanding of GPU architecture

Benefits For Software Engineer - ML Performance

Medical Insurance
Equity
  • Unlimited PTO
  • 401k
  • Covered healthcare premiums
  • Competitive compensation package
  • Inclusive and supportive work culture
  • Learning and growth opportunities
  • Exposure to various ML startups
