Baseten is a well-funded ML infrastructure company backed by investors including IVP, Spark Capital, and Greylock. The company is hiring a Software Engineer specializing in ML Performance to join its growing team. The role focuses on advancing AI applications, particularly through LLM inference optimization.
The position involves hands-on work with current ML technologies and optimization techniques. You'll implement methods such as quantization, speculative decoding, and LoRA for ML model inference. The role requires deep technical expertise in ML libraries, GPU architecture, and programming languages such as Python and C++.
This role suits someone passionate about ML performance optimization who wants to make a significant impact in the AI infrastructure space. The company provides a competitive compensation package that includes equity, healthcare benefits, and unlimited PTO. You'll also work with a range of ML startups, which brings its own learning and networking opportunities.
The work environment is inclusive and supportive, and it encourages continuous learning and professional growth. The company has already achieved product-market fit and secured Series B funding, yet it retains the dynamic, innovative culture of a startup. The role combines technical challenges, growth opportunities, and the chance to shape the future of AI infrastructure.
Working at Baseten means contributing to ML infrastructure development alongside a talented team and building products used by category-defining AI companies. The company pairs a commitment to diversity and inclusion with a strong market position and growth trajectory.