
Incident Management Engineer, AWS Incident Detection and Response

Amazon is the world's most comprehensive cloud platform provider, pioneering cloud computing through AWS (Amazon Web Services).
$80,000 - $120,000
Cloud
Mid-Level Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Enterprise SaaS · Cloud
This job posting may no longer be active.

Description For Incident Management Engineer, AWS Incident Detection and Response

Amazon Web Services (AWS) is seeking an Incident Management Engineer to join its Incident Detection and Response team within AWS Support. The role provides 24x7 monitoring and incident management for AWS Enterprise Support customers. It involves working with critical workloads, developing custom runbooks, and responding to incidents within 5 minutes of a critical alarm.

The ideal candidate will possess strong analytical skills, technical expertise, and excellent communication abilities to work effectively with customers at all levels. This role requires a blend of tactical and strategic execution, with responsibilities including incident resolution, customer engagement, and continuous improvement of operational processes.

Based in Dublin, the role operates on a "follow-the-sun" schedule with core hours of 7am to 3pm GMT (equivalently, 8am to 4pm GMT+1). Weekend and holiday rotations are part of the role. The position offers opportunities for growth within AWS's dynamic environment, where innovation and customer obsession are core values.

The role combines technical expertise with customer service, requiring skills in cloud computing, virtualization, and incident management. You'll work with leading companies building mission-critical applications on AWS, making this an excellent opportunity for someone passionate about technology and customer success.

AWS values diversity and inclusion, offering a supportive environment with employee-led affinity groups, ongoing learning opportunities, and a strong focus on work-life harmony. The company's commitment to being Earth's Best Employer means continuous investment in employee development and career growth.

Last updated 4 months ago

Responsibilities For Incident Management Engineer, AWS Incident Detection and Response

  • Drive the resolution of large scale customer impacting incidents
  • Drive critical, complex customer escalations
  • Provide critical incident response/management
  • Contribute to Problem Records for customers
  • Conduct continuous real-time proactive monitoring of customer metrics
  • Monitor and manage communications during high impact events
  • Lead projects and virtual teams to drive operational improvements
  • Create and review documentation
  • Identify and troubleshoot recurring platform issues
  • Mentor peers in technical and operational areas

Requirements For Incident Management Engineer, AWS Incident Detection and Response

Python
JavaScript
Linux
  • 1+ year of experience in a similar role
  • 2+ years of virtualization, orchestration and cloud computing experience
  • 1+ year of network and operating system support experience
  • Bachelor's degree in computer science or equivalent, or 3+ years of technical support experience
  • Experience with data manipulation and/or automation using Python, JavaScript or shell scripting
  • Effective prioritization and time management skills

Benefits For Incident Management Engineer, AWS Incident Detection and Response

  • Work-life harmony
  • Flexible working culture
  • Mentorship and career growth opportunities
  • Employee-led affinity groups
  • Inclusive team culture
