Software Engineer - ML Platform

Platform for creating, deploying, and running machine learning models, making AI accessible to every software developer.
$200,000 - $280,000
Cloud
Mid-Level Software Engineer
In-Person
11 - 50 Employees
3+ years of experience
AI

Description For Software Engineer - ML Platform

Replicate is revolutionizing AI accessibility by building the premier platform for creating, deploying, and running machine learning models. As an Infrastructure Engineer on the Platform team, you'll be at the forefront of democratizing AI technology.

The role involves managing the complete lifecycle of ML models, from packaging to deployment, serving, scaling, and monitoring. You'll be working with a system that handles thousands of models and millions of daily predictions. The platform team is responsible for critical infrastructure that powers AI accessibility for developers worldwide.

Key Technical Aspects:

  • Working with Python, Go, Node.js, Kubernetes, and Terraform
  • Managing complex distributed systems and GPU resources
  • Optimizing model inference and deployment processes
  • Handling multi-regional traffic and failover capabilities
  • Working with databases like Redis, Google BigQuery, and PostgreSQL

Company Culture:

  • Engineering-led organization with a focus on technical excellence
  • Strong emphasis on in-person collaboration at the company's office in the Mission, San Francisco
  • Team composed of experienced professionals from companies like Docker, Dropbox, GitHub, Heroku, and NVIDIA
  • Open-source focused with a commitment to building in public
  • Strong emphasis on API design and infrastructure reliability

The role offers an opportunity to:

  • Shape the future of AI accessibility
  • Work with cutting-edge ML/AI technologies
  • Build scalable systems that impact thousands of developers
  • Join a team of experienced engineers from top tech companies
  • Contribute to making AI technology more accessible and safer

This position is perfect for someone who combines strong infrastructure engineering skills with a passion for democratizing AI technology. You'll be working in a fast-paced environment where your decisions directly impact the platform's success and the broader AI developer community.

Responsibilities For Software Engineer - ML Platform

  • Design and build the deployment and model-serving platform
  • Build technology to operationalize advances in ML and AI
  • Design systems to maximize utilization of Kubernetes clusters and GPUs
  • Optimize task allocation and queuing across diverse customers
  • Speed up model inference through various optimization techniques
  • Work with the Platform team on model lifecycle management

Requirements For Software Engineer - ML Platform

Python, Go, Node.js, Kubernetes, Redis, PostgreSQL

  • Experience building platforms at scale
  • Experience with complex systems architecture
  • Ability to design and implement developer-friendly APIs
  • Hands-on experience with Kubernetes
  • Strong communication and collaboration skills
  • At least 3 years of full-time software engineering experience
