Software Engineer - Platform

CentML develops AI infrastructure to reduce the cost of developing and deploying ML models, enabling widespread AI adoption.
Backend
Mid-Level Software Engineer
Remote
AI

Description For Software Engineer - Platform

CentML is revolutionizing the AI infrastructure landscape with a mission to democratize AI by significantly reducing the cost of ML model development and deployment. The company is led by a team of experts from leading tech companies, including its co-founder and CEO, Gennady Pekhimenko, a renowned ML systems expert with multiple industry accolades.

As a Platform Software Engineer at CentML, you'll be at the forefront of developing the company's platform, which provides cost-effective infrastructure for large-scale ML operations. The role involves working with technologies such as the Hidet compiler and DeepView while building scalable solutions for ML training and inference workloads across GPU clusters.

The position offers an opportunity to work with a team of industry veterans from companies like Amazon, Google, Microsoft Research, Nvidia, Intel, Qualcomm, and IBM. You'll be contributing to a platform that aims to make AI accessible to everyone while solving complex technical challenges in ML infrastructure.

The role combines systems engineering with ML infrastructure, requiring strong technical skills in Python and potentially C++, along with experience in large-scale systems development. The company offers competitive benefits, including equity options, comprehensive healthcare, and a strong focus on professional development and work-life balance.

This is an ideal opportunity for a systems engineer passionate about AI infrastructure and interested in making a significant impact on the accessibility and efficiency of ML development and deployment.

Responsibilities For Software Engineer - Platform

  • Taking part in the design and development of the CentML platform
  • Designing and building solutions for scheduling large-scale ML training and inference workloads on GPU clusters across multiple cloud service providers (CSPs)
  • Communicating with product teams to define use cases, and developing methodology and benchmarks to evaluate different approaches

Requirements For Software Engineer - Platform

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Experience building large-scale systems from scratch
  • Strong coding skills in at least one of Python or C++
  • Solid fundamentals in computer science and computer engineering topics
  • Prior experience with container-based deployment systems such as Kubernetes is a big plus
  • Graduate degree with research experience is a plus

Benefits For Software Engineer - Platform

  • An open and inclusive work environment
  • Employee stock options
  • Best-in-class medical and dental benefits
  • Parental leave top-up
  • Professional development budget
  • Flexible vacation time
