CentML builds AI infrastructure with a mission to democratize AI by significantly reducing the cost of ML model development and deployment. Its team draws on experts from leading tech companies and is headed by co-founder and CEO Gennady Pekhimenko, a renowned expert in ML systems.
As a Senior Software Engineer in Infrastructure, you will play a pivotal role in shaping the future of ML infrastructure. You'll design and develop the CentML platform's deployment infrastructure, which manages ML training and inference across multiple cloud providers, including AWS, GCP, Azure, CoreWeave, and OCI. This role combines deep technical expertise in containerization, cloud infrastructure, and GPU technologies with the opportunity to lead a team of engineers.
The position offers an opportunity to work on cutting-edge technology that directly affects how accessible AI is. You'll work with state-of-the-art GPU clusters, implement sophisticated scheduling solutions, and ensure the platform's scalability and performance. The role requires a strong background in containerized deployment systems, cloud infrastructure, and programming languages such as Python, Java, and Go.
Working at CentML means joining a company that values diversity, inclusion, and work-life balance. The company offers competitive benefits including equity options, comprehensive healthcare, and professional development opportunities. Whether you're based in Toronto or San Francisco, you'll be part of a team that's pushing the boundaries of what's possible in AI infrastructure.