ML Engineer L4, Consumer Inference

Netflix

Netflix is one of the world's leading entertainment services with 278 million paid memberships in over 190 countries enjoying TV series, films and games across a wide variety of genres and languages.

Los Gatos, CA, USA

$100,000 - $464,000

Machine Learning

Mid-Level Software Engineer

In-Person

5,000+ Employees

4+ years of experience

AI · Enterprise SaaS · Entertainment

Description For ML Engineer L4, Consumer Inference

Netflix is seeking a Machine Learning Engineer to join its Machine Learning Platform (MLP) team. The role focuses on bridging the gap between ML research and productization. Key responsibilities include:

Developing customer-facing libraries and services for efficient and scalable ML model inference
Building and maintaining online inference services for real-time predictions
Optimizing and deploying large language models (LLMs) for production environments
Maintaining and improving a model registry for ML model governance
Participating in ML Platform incident management and support

The ideal candidate will have:

Strong programming skills in Python and Java
Experience with ML libraries like TensorFlow and PyTorch
Familiarity with GPU inference optimization tools (e.g., Triton Inference Server, TensorRT)
Knowledge of containerization (Docker) and orchestration (Kubernetes)
Experience in large-scale build, release, CI/CD, and observability techniques
Strong customer focus and excellent communication skills

Netflix offers a unique culture with true transparency and autonomy. The role provides opportunities for impact, responsibility, and continuous learning in a collaborative environment. The company provides comprehensive benefits, including health plans, mental health support, 401(k) with employer match, stock options, and paid time off.

Netflix is committed to diversity and inclusion, providing equal opportunities to all candidates regardless of background.

Last updated 7 months ago

Responsibilities For ML Engineer L4, Consumer Inference

Develop customer-facing libraries and services for ML model inference
Build and maintain online inference services for real-time predictions
Optimize and deploy large language models (LLMs) for production
Maintain and improve a model registry for ML model governance
Participate in ML Platform incident management and support

Requirements For ML Engineer L4, Consumer Inference

Python

Java

Kubernetes

Strong programming skills in Python and Java
Familiarity with ML libraries like TensorFlow and PyTorch
Experience with GPU inference optimization tools (e.g., Triton Inference Server, TensorRT)
Knowledge of containerization (Docker) and orchestration (Kubernetes)
Experience in large-scale build, release, CI/CD, and observability techniques
Strong customer focus and communication skills

Benefits For ML Engineer L4, Consumer Inference

401k

Medical Insurance

Mental Health Assistance

Parental Leave

Equity

Health Plans
Mental Health support
401(k) Retirement Plan with employer match
Stock Option Program
Disability Programs
Health Savings and Flexible Spending Accounts
Family-forming benefits
Life and Serious Injury Benefits
Paid leave of absence programs
35 days annually for paid time off (for hourly employees)
Flexible time off (for salaried employees)

Netflix

Implement a stack

Data Structures & Algorithms · Medium

How would you implement a stack using another data structure?

Stacks
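One possible answer to the stack question above, as a minimal sketch: it assumes the "other data structure" is a FIFO queue, here Python's `collections.deque` used only through `append` and `popleft`. The class name `QueueBackedStack` is hypothetical, and other backings (a linked list, two queues) would be equally valid.

```python
from collections import deque


class QueueBackedStack:
    """LIFO stack built on a single FIFO queue (illustrative name)."""

    def __init__(self):
        # The deque is used strictly as a queue: append at the back,
        # popleft from the front.
        self._queue = deque()

    def push(self, item):
        # Enqueue the new item, then move every older item behind it,
        # so the newest item always sits at the front of the queue.
        self._queue.append(item)
        for _ in range(len(self._queue) - 1):
            self._queue.append(self._queue.popleft())

    def pop(self):
        if not self._queue:
            raise IndexError("pop from empty stack")
        return self._queue.popleft()

    def peek(self):
        if not self._queue:
            raise IndexError("peek from empty stack")
        return self._queue[0]

    def __len__(self):
        return len(self._queue)
```

With this layout, `push` costs O(n) and `pop` costs O(1). The trade-off can be reversed by doing the rotation in `pop` instead, which may suit push-heavy workloads better.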

Netflix

How would you design a thread synchronization function for common resources?

System Design · Medium

Design a thread synchronization function to access common resources. Imagine you have multiple threads that need to access and modify a shared data structure, such as a counter or a list. Without proper synchronization, race conditions can occur, leading to unpredictable and incorrect results. For example, two threads might try to increment the same counter simultaneously, resulting in only one increment instead of two. Your task is to design a function or mechanism that allows these threads to safely access and modify the shared resource.

Consider the following requirements:

Mutual Exclusion: Only one thread should be able to access the shared resource at any given time.
Fairness: Threads should not be indefinitely blocked from accessing the resource (no starvation).
Efficiency: The synchronization mechanism should not introduce excessive overhead that significantly impacts performance.

Explain your design choices, including the type of synchronization primitive you would use (e.g., mutex, semaphore, condition variable), how it would be implemented, and how it would ensure mutual exclusion, fairness, and efficiency. Provide code snippets or pseudocode to illustrate your solution. Discuss potential issues such as deadlocks and how to prevent them. How would your design handle a scenario with a high number of threads contending for the resource?
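One way to approach the thread synchronization question above is a ticket lock built on a condition variable. The sketch below is illustrative rather than a reference answer: the names `TicketLock`, `SafeCounter`, and `run_demo` are hypothetical, and it relies only on Python's standard `threading` module. Each thread takes a ticket on arrival and waits until its number is served, which gives mutual exclusion and first-come, first-served fairness.

```python
import threading


class TicketLock:
    """Mutual-exclusion lock that serves threads in arrival order."""

    def __init__(self):
        self._cond = threading.Condition()
        self._next_ticket = 0  # next ticket number to hand out
        self._now_serving = 0  # ticket number currently allowed in

    def acquire(self):
        with self._cond:
            ticket = self._next_ticket
            self._next_ticket += 1
            # Wait until every earlier ticket holder has released.
            while ticket != self._now_serving:
                self._cond.wait()

    def release(self):
        with self._cond:
            self._now_serving += 1
            # Wake all waiters; only the next ticket holder proceeds.
            self._cond.notify_all()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Releasing here avoids a stuck lock if the critical
        # section raises an exception.
        self.release()
        return False


class SafeCounter:
    """Shared counter whose increments are guarded by the lock."""

    def __init__(self):
        self._lock = TicketLock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1


def run_demo(num_threads=8, increments=1000):
    """Run several threads against one counter and return the total."""
    counter = SafeCounter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Two caveats apply. Under heavy contention, `notify_all` wakes every waiter on each release, so a design with one condition or event per waiting thread would likely scale better. With several locks in play, acquiring them in a fixed global order is a common way to rule out deadlock.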

Netflix

Tell me about a time you went above and beyond in a project.

Behavioral

Tell me about a time you went above and beyond in a project. What was the situation, what actions did you take, and what was the result? For example, perhaps you identified a critical bug that wasn't in your assigned area and took the initiative to fix it, or maybe you proactively automated a manual process that saved the team significant time. Detail the specific steps you took and the impact your actions had on the project's success or the team's efficiency. What did you learn from that experience, and how has it influenced your approach to projects since then? Focus on demonstrating your problem-solving skills, your ability to anticipate needs, and your commitment to delivering exceptional results.

Jobs Related To Netflix ML Engineer L4, Consumer Inference

Software Engineer L4, Machine Learning Platform (Metaflow)

Netflix

Mid-level Software Engineer role at Netflix, focusing on Machine Learning Platform development with Metaflow, offering competitive compensation and comprehensive benefits.

Software Engineer (L4), Consumer ML Model Compute & Serving Systems

Netflix

Netflix is hiring a Software Engineer (L4) for their Consumer ML Model Compute & Serving Systems team to develop scalable ML infrastructure and advance AI initiatives.

Software Engineer (L4), Consumer ML Model Compute & Serving Systems

Netflix

Netflix is hiring a Software Engineer (L4) for Consumer ML Model Compute & Serving Systems to build scalable ML infrastructure and advance AI initiatives.

Forward-Deployed AI Engineer

OfferFit

Forward-Deployed AI Engineer position at OfferFit, implementing production-scale AI solutions with focus on reinforcement learning and customer success.

Software Engineer II, Machine Learning Platform

Attentive

Mid-level Software Engineer position focused on building and maintaining ML platform infrastructure at Attentive, offering competitive salary and benefits in San Francisco.