Site Reliability Engineer L4/L5 - Cloud Platform SRE

Netflix is one of the world's leading entertainment services with 283 million paid memberships in over 190 countries enjoying TV series, films and games.
$100,000 - $720,000
Site Reliability
Senior Software Engineer
Remote
5,000+ Employees
5+ years of experience
Enterprise SaaS · Entertainment

Description For Site Reliability Engineer L4/L5 - Cloud Platform SRE

Netflix, a global entertainment powerhouse serving 283 million subscribers across 190+ countries, is seeking a Site Reliability Engineer for its Cloud Platform SRE team. This role is crucial in building and maintaining the massive distributed systems that power Netflix's streaming service, live content, and gaming platforms.

As a Cloud Platform SRE, you'll be at the forefront of ensuring Netflix's service reliability at an unprecedented scale. You'll work closely with the observability team to implement sophisticated monitoring solutions and develop tools that prevent incidents and enable real-time failure detection. Your work will directly impact millions of users' streaming experience by maintaining and improving the platform's reliability.

The role offers an opportunity to work with cutting-edge technologies including AWS Cloud Platform, microservices architecture, and real-time analytics tools like Kafka and Presto/Trino. You'll be handling complex distributed systems challenges while collaborating with various teams to implement reliability at scale.

Netflix offers a unique culture that values innovation, freedom, and responsibility. The compensation is highly competitive, with a flexible structure allowing you to choose between salary and stock options. The company provides comprehensive benefits including health plans, mental health support, and a 401(k) with employer match.

This is an excellent opportunity for experienced SREs who want to work on some of the world's largest and most complex distributed systems, making a direct impact on how millions of people consume entertainment globally. The remote work arrangement offers flexibility while being part of a team that's revolutionizing the entertainment industry.

Responsibilities For Site Reliability Engineer L4/L5 - Cloud Platform SRE

  • Drive continual improvement in observability, monitoring, and scalability
  • Develop tooling for automation, execution, and data analysis
  • Discover and fill gaps in end-to-end reliability
  • Write and review code, develop documentation, and debug complex distributed systems
  • Coordinate and collaborate across multiple stakeholders to implement reliability at scale

Requirements For Site Reliability Engineer L4/L5 - Cloud Platform SRE

Go
JavaScript
Python
Kafka
Linux
  • 5+ years of service reliability/operational experience with large-scale systems
  • Knowledge of and experience with AWS Cloud Platform including API Gateway and Microservices
  • Expert-level knowledge of Unix or Linux systems and TCP/IP network fundamentals
  • Proficient understanding of networking principles and protocols (DNS, TLS, HTTP(S), gRPC)
  • Proficient in programming languages such as Go, JavaScript, and Python
  • Experience with real-time and big data analytics (Kafka, time-series databases, Presto/Trino, Spark SQL)
  • Ability to work in a highly collaborative environment
  • Preferred: B.S. in Computer Science, Electrical Engineering, or Computer Engineering (or equivalent)

Benefits For Site Reliability Engineer L4/L5 - Cloud Platform SRE

401k
Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Assistance
Equity
  • Health Plans
  • Mental Health support
  • 401(k) Retirement Plan with employer match
  • Stock Option Program
  • Disability Programs
  • Health Savings and Flexible Spending Accounts
  • Family-forming benefits
  • Life and Serious Injury Benefits
  • Paid leave of absence programs
  • 35 days annually for paid time off (hourly employees)
  • Flexible time off (salaried employees)
