Lead MLOps Engineer

Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience
AI · Enterprise SaaS

Description For Lead MLOps Engineer

XR is a global technology platform powering the creative economy. Its unified platform moves creative and productions forward, simplifying the fragmentation and delivering global insights that drive increased business value. XR operates in 130 countries and 45 languages, serving the top global advertisers and enabling $150 billion in video ad spend around the world. More than half a billion creative brand assets are managed in XR's enterprise platform.

The Lead MLOps Engineer plays a critical role in ensuring the seamless integration, deployment, monitoring, and scaling of machine learning models into production. The role blends the expertise of DevOps and machine learning to bridge the gap between data science and operational systems, ensuring that ML models perform reliably and at scale in real-world environments. As the Lead MLOps Engineer, you'll drive best practices for model lifecycle management and create the infrastructure to automate and streamline workflows.

Key responsibilities include:

  • Designing and architecting the AI/ML models platform
  • Building and managing infrastructure for ML model deployment
  • Architecting MLOps systems with tools like AWS SageMaker, MLflow, Step Functions, and Lambda
  • Leading CI/CD pipeline implementation for model deployment
  • Ensuring scalability and efficiency of models
  • Setting up monitoring and logging solutions
  • Defining and promoting MLOps best practices
  • Providing technical leadership and mentorship
  • Partnering with global engineering teams
  • Collaborating with Data Scientists, DevOps teams, and Product Managers
  • Staying up to date with the latest MLOps trends and technologies

The ideal candidate will have:

  • MS/BS in Computer Science or related field
  • 5+ years of experience in MLOps, with 2+ years in leadership
  • Proficiency in Python, shell scripting, and ML/computer vision
  • Experience with cloud services, containerization, and CI/CD pipelines
  • Strong problem-solving and analytical skills

Join Extreme Reach to play a crucial role in advancing ML operations and driving innovation in the creative economy.

Last updated 5 months ago

Responsibilities For Lead MLOps Engineer

  • Design and architect the AI/ML models platform to support scalable, efficient, and high-performance machine learning workflows
  • Build and manage infrastructure that supports the deployment of machine learning models, including leveraging cloud services (AWS), CDK, and containerization tools like Docker
  • Architect and develop MLOps systems with tools such as AWS SageMaker, MLflow, Step Functions, and Lambda
  • Lead the design and implementation of CI/CD pipelines to automate model deployment and rollback processes
  • Ensure scalability and efficiency of the models to handle real-time predictions and batch processing
  • Set up monitoring and logging solutions for tracking the performance of models in production (Datadog, CloudWatch)
  • Define and promote best practices in MLOps
  • Provide technical leadership and mentorship to MLOps engineers on technologies and standard processes
  • Partner with the global engineering team to drive cross-functional alignment and ensure seamless integration of AI/ML models into the wider data ecosystem
  • Work closely with Data Scientists, DevOps teams, and Product Managers to ensure that machine learning models are integrated into business workflows and deployed effectively
  • Stay up to date with the latest trends and technologies in MLOps and machine learning deployment, and identify opportunities to incorporate new tools or practices that improve efficiency

Requirements For Lead MLOps Engineer

Python
Kubernetes
  • MS/BS in Computer Science or related background preferred
  • 5+ years of experience in MLOps or related roles, with at least 2 years in a leadership or senior engineering capacity
  • Proven experience leading and mentoring teams, managing multiple stakeholders, and delivering projects on time
  • Proficiency in Python is essential
  • Experience with shell scripting, system diagnostics, and automation tooling
  • Proficiency and professional experience in ML and computer vision
  • Experience building and deploying ML, computer vision, or GenAI solutions (PyTorch, TensorFlow)
  • Experience working with databases to manage the flow of data through the machine learning lifecycle
  • Experience with cloud-native services for machine learning, such as AWS SageMaker, MLflow, Step Functions, and Lambda, is essential
  • Deep expertise in Docker for containerization of machine learning models and tools is essential
  • Experience delivering environments using infrastructure-as-code techniques (AWS CDK, CloudFormation)
  • Experience setting up and managing CI/CD pipelines for ML workflows using tools like Jenkins and GitLab
  • Experience working in a fast-paced, innovative Agile SDLC environment
  • Strong problem-solving, organizational, and analytical skills
  • Experience with Databricks is beneficial
  • Experience in building and managing training, evaluation and testing datasets is beneficial
  • Knowledge of security best practices in the context of machine learning
