Platform engineer, MLOps

Writer is the full-stack generative AI platform delivering transformative ROI for the world's leading enterprises.
DevOps
Senior Software Engineer
Hybrid
101 - 500 Employees
5+ years of experience

Description For Platform engineer, MLOps

Writer, a leading full-stack generative AI platform, is seeking a Platform engineer, MLOps to join its team in London, UK. This hybrid role is central to deploying and managing infrastructure for AI/ML operations. As a Platform engineer, MLOps, you'll collaborate with AI/ML engineers and researchers to develop robust CI/CD pipelines, set up monitoring systems, and maintain large Kubernetes clusters with GPU workloads.

The ideal candidate has 5+ years of experience building core infrastructure, with expertise in tools like Terraform, Python, Docker, and Kubernetes. Familiarity with cloud platforms (GCP, AWS, Azure) and monitoring tools (Prometheus, Grafana) is essential. You should be comfortable with ambiguity and rapid change, and have a knack for troubleshooting complex systems.

Writer offers a comprehensive benefits package, including generous PTO, medical coverage, paid parental leave, and various stipends for personal development and well-being. This role presents an exciting opportunity to make a significant impact in a dynamic, fast-paced environment at the forefront of AI technology.

Join Writer's team of over 200 employees who think big and move fast, and be part of creating a better future of work. If you're passionate about AI/ML infrastructure and ready to tackle challenging problems in a rapidly evolving field, this role could be perfect for you.

Last updated 5 months ago

Responsibilities For Platform engineer, MLOps

  • Work closely with AI/ML engineers and researchers to design and deploy a CI/CD pipeline
  • Set up and manage monitoring, logging, and alerting systems for training runs and APIs
  • Ensure training environments are consistently available across multiple clusters
  • Develop and manage containerization and orchestration systems
  • Operate and oversee large Kubernetes clusters with GPU workloads
  • Improve reliability, quality, and time-to-market of software solutions
  • Measure and optimize system performance
  • Provide primary operational support and engineering for large-scale distributed software applications

Requirements For Platform engineer, MLOps

Python
Kubernetes
Linux
  • Professional experience with infrastructure as code tools like Terraform
  • Experience with scripting languages such as Python or Bash
  • Experience with cloud platforms such as Google Cloud, AWS or Azure
  • Familiarity with Git and GitHub workflows
  • Familiarity with high-performance, large-scale ML systems
  • Ability to troubleshoot complex systems
  • Proactive in identifying problems, performance bottlenecks, and areas for improvement
  • Comfortable with ambiguity and rapid change
  • 5+ years building core infrastructure
  • Experience running inference clusters at scale
  • Experience operating orchestration systems such as Kubernetes at scale

Benefits For Platform engineer, MLOps

  • Generous PTO, plus company holidays
  • Medical, dental, and vision coverage for you and your family
  • Paid parental leave for all parents (12 weeks)
  • Fertility and family planning support
  • Early-detection cancer testing through Galleri
  • Flexible spending account and dependent FSA options
  • Health savings account for eligible plans with company contribution
  • Annual work-life stipends for home office setup, cell phone, internet
  • Wellness stipend for gym, massage/chiropractor, personal training, etc.
  • Learning and development stipend
  • Company-wide off-sites and team off-sites
  • Competitive compensation, company stock options and 401k
