Staff Software Engineer - ML Platform

Replicate is a platform that makes AI accessible to every software developer by providing tools for creating, deploying, and running machine learning models.
$230,000 - $280,000
Cloud
Staff Software Engineer
Hybrid
10+ years of experience
AI

Description For Staff Software Engineer - ML Platform

Replicate aims to give every software developer access to machine learning capabilities. As a Staff Software Engineer on the Platform team, you'll work directly on making generative AI accessible to everyone.

The role involves working with the Platform team that manages the complete lifecycle of ML models - from packaging and deployment to serving, scaling, and monitoring. You'll be building infrastructure that supports thousands of models and handles millions of daily predictions. This position offers a unique opportunity to create innovative solutions where your decisions have direct impact.

The ideal candidate brings extensive experience in platform engineering, with a strong background in building scalable systems and working with technologies like Python, Go, Node.js, Kubernetes, and various databases. You'll be responsible for designing and implementing systems that maximize GPU utilization, handle complex workload distribution, and optimize model inference performance.

Working from Replicate's office in San Francisco's Mission district (minimum 3 days per week), you'll collaborate with talented engineers to solve challenging problems in AI infrastructure. The role offers competitive compensation ($230K-$280K) and the opportunity to work on cutting-edge AI technology.

This role sits at the intersection of infrastructure and AI: you'll build systems that let developers worldwide use machine learning. You'll face challenges in resource optimization, system reliability, and performance tuning, all while making AI more accessible to the broader developer community.

The position requires strong technical expertise combined with excellent communication skills, as you'll be working closely with various teams to understand needs and translate complex technical concepts into actionable solutions. While direct ML experience isn't required, familiarity with ML platforms and production AI systems would be valuable.

Last updated 12 days ago

Responsibilities For Staff Software Engineer - ML Platform

  • Designing and building the deployment and model-serving platform
  • Building technology to put ML and AI advancements into production
  • Designing systems to maximize utilization and reliability of Kubernetes clusters and GPUs
  • Owning and optimizing task allocation and queuing across customers
  • Working on model inference optimization through caching, weights management, and runtime optimizations

Requirements For Staff Software Engineer - ML Platform

  • Python, Go, Node.js, Kubernetes, Redis, PostgreSQL
  • Experience building platforms at scale
  • Experience with complex systems
  • Ability to design and implement developer-friendly APIs
  • Hands-on experience with Kubernetes
  • Strong communication and collaboration skills
  • At least 10 years of full-time software engineering experience
