Lead MLOps Engineer

Machine Learning
Staff Software Engineer
Hybrid
5+ years of experience
AI · Enterprise SaaS

Description For Lead MLOps Engineer

XR is a global technology platform powering the creative economy. Its unified platform moves creative and production forward, reducing fragmentation and delivering global insights that increase business value. XR operates in 130 countries and 45 languages, serving the top global advertisers and enabling $150 billion in video ad spend around the world. More than half a billion creative brand assets are managed in XR's enterprise platform.

The Lead MLOps Engineer plays a critical role in ensuring the seamless integration, deployment, monitoring, and scaling of machine learning models into production. The role blends the expertise of DevOps and machine learning to bridge the gap between data science and operational systems, ensuring that ML models perform reliably and at scale in real-world environments. As the Lead MLOps Engineer, you'll drive best practices for model lifecycle management and create the infrastructure to automate and streamline workflows.

Key responsibilities include:

  • Designing and architecting the AI/ML models platform
  • Building and managing infrastructure for ML model deployment
  • Architecting MLOps systems with tools like AWS SageMaker, MLflow, Step Functions, and Lambda
  • Leading CI/CD pipeline implementation for model deployment
  • Ensuring scalability and efficiency of models
  • Setting up monitoring and logging solutions
  • Defining and promoting MLOps best practices
  • Providing technical leadership and mentorship
  • Partnering with global engineering teams
  • Collaborating with Data Scientists, DevOps teams, and Product Managers
  • Staying up to date with the latest MLOps trends and technologies

The ideal candidate will have:

  • MS/BS in Computer Science or related field
  • 5+ years of experience in MLOps, with 2+ years in leadership
  • Proficiency in Python, shell scripting, and ML/computer vision
  • Experience with cloud services, containerization, and CI/CD pipelines
  • Strong problem-solving and analytical skills

Join Extreme Reach to play a crucial role in advancing ML operations and driving innovation in the creative economy.

Last updated 4 months ago

Responsibilities For Lead MLOps Engineer

  • Design and architect the AI/ML models platform to support scalable, efficient, and high-performance machine learning workflows
  • Build and manage infrastructure that supports the deployment of machine learning models, including leveraging cloud services (AWS), CDK, and containerization tools like Docker
  • Architect and develop MLOps systems with tools such as AWS SageMaker, MLflow, Step Functions, and Lambda
  • Lead the design and implementation of CI/CD pipelines to automate model deployment and rollback processes
  • Ensure scalability and efficiency of the models to handle real-time predictions and batch processing
  • Set up monitoring and logging solutions for tracking the performance of models in production (Datadog, CloudWatch)
  • Define and promote best practices in MLOps
  • Provide technical leadership and mentorship to MLOps engineers on technologies and standard processes
  • Partner with the global engineering team to drive cross-functional alignment and ensure seamless integration of AI/ML models into the wider data ecosystem
  • Work closely with Data Scientists, DevOps teams, and Product Managers to ensure that machine learning models are integrated into business workflows and deployed effectively
  • Stay up to date with the latest trends and technologies in MLOps and machine learning deployment, and identify opportunities to incorporate new tools or practices that improve efficiency

Requirements For Lead MLOps Engineer

Python
Kubernetes
  • MS/BS in Computer Science or related background preferred
  • 5+ years of experience in MLOps or related roles, with at least 2 years in a leadership or senior engineering capacity
  • Proven experience leading and mentoring teams, managing multiple stakeholders, and delivering projects on time
  • Proficiency in Python is essential
  • Experience with shell scripting, system diagnostics, and automation tooling
  • Proficiency and professional experience in ML and computer vision
  • Have built and deployed ML, computer vision or GenAI solutions (PyTorch, TensorFlow)
  • Experience working with databases to manage the flow of data through the machine learning lifecycle
  • Experience with cloud-native services for machine learning, such as AWS SageMaker, MLflow, Step Functions, and Lambda, is essential
  • Deep expertise in Docker for containerization of machine learning models and tools is essential
  • Experience delivering environments using infrastructure-as-code techniques (AWS CDK, CloudFormation)
  • Experience setting up and managing CI/CD pipelines for ML workflows using tools like Jenkins and GitLab
  • Experience in fast-paced, innovative, Agile SDLC
  • Strong problem-solving, organizational, and analytical skills
  • Experience with Databricks is beneficial
  • Experience in building and managing training, evaluation and testing datasets is beneficial
  • Knowledge of security best practices in the context of machine learning
