Senior Systems Reliability Operations Engineer

The Walt Disney Company, together with its subsidiaries and affiliates, is a leading diversified international family entertainment and media enterprise that includes three core business segments: Disney Entertainment, ESPN, and Disney Experiences.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS

Description For Senior Systems Reliability Operations Engineer

The Disney Technology Operations Command Center (DTOC) is seeking a Senior Systems Reliability Operations Engineer to join their 24x7x365 mission-critical services operation center. This role is responsible for monitoring, identifying, and coordinating with technologists across segments to fine-tune system operations and resolve service interruptions. The SRO Engineer will provide operational oversight and technical leadership, examining IT systems for defects and communicating maintenance schedules and critical events across the company.

Key responsibilities include:

  • Driving efficiency and effectiveness of Event, Incident, Major Incident, Request Fulfillment, and Problem Management processes
  • Implementing and maintaining technology observability and alerting solutions
  • Establishing and maintaining technology service level objectives (SLOs) and service level indicators (SLIs)
  • Proactively identifying, diagnosing, and resolving infrastructure, application, and IT operations issues
  • Developing automation tools and scripts to improve IT operations efficiency
  • Analyzing and publishing operational utilization and service performance metrics
  • Participating in creating and reviewing department procedures and operational readiness plans
  • Serving as incident commander on service outage calls

Required qualifications:

  • BA/BS in Computer Science, Engineering, or related field (or equivalent work experience)
  • 5+ years of experience supporting converged infrastructure stacks
  • 5+ years of experience leading incident recovery with multi-disciplinary teams
  • 3+ years of experience in a large IT shared services or outsourced environment
  • Experience with cloud operations (AWS, Google Cloud, or Azure)
  • Strong understanding of Windows, Linux/Unix operating systems, and networking concepts
  • Proficiency in scripting languages (e.g., Python, Bash, Ruby) and automation tools
  • Familiarity with observability, monitoring, and alerting tools
  • ITIL v3 Certification preferred

This role offers the opportunity to work with world-class leaders and drive strategies that keep The Walt Disney Company at the leading edge of entertainment technology.

Benefits For Senior Systems Reliability Operations Engineer

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • 401k
  • Education Budget
  • Health Insurance & Wellbeing
  • Childcare Options
  • Paid Time Off
  • Retirement Programs
  • Tuition Assistance
  • Weekly Pay
