Senior Systems Reliability Operations Engineer

The Walt Disney Company, together with its subsidiaries and affiliates, is a leading diversified international family entertainment and media enterprise that includes three core business segments: Disney Entertainment, ESPN, and Disney Experiences.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS

Description For Senior Systems Reliability Operations Engineer

The Disney Technology Operations Command Center (DTOC) is seeking a Senior Systems Reliability Operations Engineer to join its 24x7x365 mission-critical services operation center. This role is responsible for monitoring systems, identifying issues, and coordinating with technologists across segments to fine-tune system operations and resolve service interruptions. The SRO Engineer will provide operational oversight and technical leadership, examining IT systems for defects and communicating maintenance schedules and critical events across the company.

Key responsibilities include:

  • Driving efficiency and effectiveness of Event, Incident, Major Incident, Request Fulfillment, and Problem Management processes
  • Implementing and maintaining technology observability and alerting solutions
  • Establishing and maintaining service level objectives (SLOs) and service level indicators (SLIs)
  • Proactively identifying, diagnosing, and resolving infrastructure, application, and IT operations issues
  • Developing automation tools and scripts to improve IT operations efficiency
  • Analyzing and publishing operational utilization and service performance metrics
  • Participating in creating and reviewing department procedures and operational readiness plans
  • Serving as incident commander on service outage calls

Required qualifications:

  • BA/BS in Computer Science, Engineering, or related field (or equivalent work experience)
  • 5+ years of experience supporting converged infrastructure stacks
  • 5+ years leading incident recovery with multidisciplinary teams
  • 3+ years of experience in a large IT shared-services or outsourced environment
  • Experience with cloud operations (AWS, Google Cloud, or Azure)
  • Strong understanding of Windows, Linux/Unix operating systems, and networking concepts
  • Proficiency in scripting languages (e.g., Python, Bash, Ruby) and automation tools
  • Familiarity with observability, monitoring, and alerting tools
  • ITIL v3 Certification preferred

This role offers the opportunity to work with world-class leaders and drive strategies that keep The Walt Disney Company at the leading edge of entertainment technology.

Benefits For Senior Systems Reliability Operations Engineer

Medical Insurance
Dental Insurance
Vision Insurance
401(k)
Education Budget
  • Health Insurance & Wellbeing
  • Childcare Options
  • Paid Time Off
  • Retirement Programs
  • Tuition Assistance
  • Weekly Pay
