Infrastructure Engineer

A platform for creating, deploying, and running machine learning models, making AI accessible to every software developer.
$200,000 - $280,000
Cloud
Mid-Level Software Engineer
In-Person
3+ years of experience
AI

Description For Infrastructure Engineer

Replicate is revolutionizing AI accessibility by building the premier platform for creating, deploying, and running machine learning models. As an Infrastructure Engineer on the Platform team, you'll be at the forefront of making generative AI available to developers worldwide.

The role involves managing the complete lifecycle of ML models, from packaging and deployment to serving, scaling, and monitoring. You'll be working with a platform that supports thousands of models and handles millions of daily predictions. This position offers a unique opportunity to build innovative solutions where your decisions have direct impact.

The technical stack includes Python, Go, Node.js, Kubernetes, Terraform, and databases like Redis, Google BigQuery, and PostgreSQL. You'll be working on critical infrastructure components, including multi-regional traffic management, GPU optimization, and sophisticated task allocation systems.

The ideal candidate brings experience in platform development at scale, understanding of complex systems architecture, and proven ability with Kubernetes operations. While ML/AI production experience is a plus, the role focuses on infrastructure rather than model building. Strong communication skills are essential as you'll be collaborating closely with teams and translating complex concepts into actionable insights.

Based in Replicate's Mission District office in San Francisco, this role offers the chance to be part of building a strong in-person culture while working on cutting-edge AI infrastructure. You'll be joining a team dedicated to democratizing AI technology and making it accessible to developers everywhere.

Responsibilities For Infrastructure Engineer

  • Designing and building the deployment and model-serving platform
  • Building technology to operate ML and AI advancements
  • Designing systems to maximize utilization and reliability of Kubernetes clusters and GPUs
  • Owning and optimizing task allocation and queuing across customers
  • Working on model inference optimization through caching, weights management, and runtime optimizations

Requirements For Infrastructure Engineer

  • Python, Go, Node.js, Kubernetes, Redis, PostgreSQL
  • Experience building platforms at scale
  • Experience with complex systems
  • Experience designing and implementing developer-friendly APIs
  • Hands-on experience with Kubernetes
  • Strong communication and collaboration skills
  • At least 3 years of full-time software engineering experience
