Site Reliability Engineer, Managed Operations

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing services.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS · Cloud

Description For Site Reliability Engineer, Managed Operations

AWS is launching its first European Sovereign Cloud (ESC), a groundbreaking initiative in Utility Computing. As a Site Reliability Engineer in the AWS Managed Operations team, you'll play a crucial role in building and leading operations for high-availability AWS services like EC2, S3, DynamoDB, Lambda, and Bedrock, specifically for EU customers.

The role splits evenly between operating production systems and implementing long-term improvements. You'll be part of AWS Utility Computing (UC), which provides foundational services and continuous product innovations. Your responsibilities include overseeing the ESC launch in 2025, collaborating with global teams, and ensuring optimal service performance.

Working at AWS means joining the world's leading cloud platform provider, where innovation is constant. The company values diverse experiences and fosters an inclusive culture through employee-led affinity groups and ongoing learning opportunities. You'll benefit from extensive mentorship, career growth resources, and a strong work-life harmony philosophy.

The ideal candidate brings experience with modern programming languages (Java, TypeScript, Python, Ruby), Linux systems, and automation. You'll work in Berlin, Germany, with relocation support available within the EU. This role offers the unique opportunity to shape the future of cloud computing in Europe while working with cutting-edge technologies and world-class teams.

Join AWS to be part of a transformative project that combines technical excellence with customer obsession, all while maintaining high standards for security and reliability in cloud computing.

Last updated 3 days ago

Responsibilities For Site Reliability Engineer, Managed Operations

  • Oversee the launch of the European Sovereign Cloud (ESC) in 2025
  • Operate production systems (50% of time)
  • Make long-term improvements to reliability, availability, and performance (50% of time)
  • Perform root cause analysis of deployment failures
  • Execute highly sensitive time-critical changes to production
  • Participate in design discussions and code reviews
  • Participate in on-call rotations
  • Collaborate with global AWS teams
  • Ensure a high-availability experience for EU customers

Requirements For Site Reliability Engineer, Managed Operations

Python · Java · TypeScript · Ruby · Linux
  • Experience in at least one modern programming language such as Java, TypeScript, Python, or Ruby
  • Familiarity with Linux, using the command line and basic administration
  • Experience with computer networking fundamentals
  • Experience with scripting and automation
  • Fluency in written and spoken English
  • Legal right to work in Germany

Benefits For Site Reliability Engineer, Managed Operations

Relocation Benefits · Visa Sponsorship
  • Relocation support within the EU
  • Mentorship and career growth opportunities
  • Work-life harmony
  • Employee-led affinity groups
  • Inclusive team culture
  • Continuous learning opportunities
