Sr. Hardware Reliability Engineer, Infrastructure Reliability & Quality

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
$140,000 - $220,000
DevOps
Senior Software Engineer
In-Person
5,000+ Employees
8+ years of experience
Enterprise SaaS · Cloud

Description For Sr. Hardware Reliability Engineer, Infrastructure Reliability & Quality

AWS Infrastructure Services is seeking a Senior Hardware Reliability Engineer to join the team responsible for keeping the cloud running. This role focuses on proactively identifying, assessing, and mitigating reliability risks in datacenter infrastructure equipment. You'll work with critical systems including air handling units, generators, transformers, and UPS systems.

The position requires expertise in Physics-of-Failure-based approaches and in developing both analytical and empirical methods for quality and reliability risk assessment across product lifecycle stages. You'll analyze thermal, electrical, chemical, and mechanical stresses to identify product weaknesses and drive improvements in datacenter availability.

As part of AWS Infrastructure Services, you'll join a diverse team of software, hardware, and network engineers working on the most challenging problems in cloud infrastructure. The role involves close collaboration with internal teams and external partners, including suppliers, to drive product specifications and risk management.

Key responsibilities include developing datacenter system-level reliability models, driving Design for Reliability (DFR) methodology, overseeing equipment testing, and conducting root cause analysis of field failures. You'll also provide strategic input on maintenance schedules and equipment replacement based on reliability data.

AWS offers a collaborative environment where innovation is constant and customer focus is paramount. The company provides extensive career development resources, mentorship opportunities, and promotes work-life harmony. You'll be part of an inclusive culture that values diverse experiences and perspectives, with access to employee-led affinity groups and ongoing learning experiences.

This role offers the opportunity to directly impact the reliability and efficiency of AWS's global infrastructure while working with cutting-edge technology and talented professionals. The position requires travel within the US and internationally, offering exposure to various AWS facilities and partners worldwide.


Responsibilities For Sr. Hardware Reliability Engineer, Infrastructure Reliability & Quality

  • Drive DFR (Design for Reliability) methodology for new product designs
  • Drive reliability/quality qualification of third-party critical infrastructure equipment
  • Oversee factory and site testing of third-party equipment
  • Guide and support Root Cause Analysis of field failures
  • Make recommendations about AWS infrastructure maintenance and equipment replacement
  • Analyze internal reliability data and create metrics
  • Develop end-of-life strategy for critical infrastructure equipment
  • Support Design Failure Mode and Effects Analyses (DFMEAs) as needed

Requirements For Sr. Hardware Reliability Engineer, Infrastructure Reliability & Quality

  • Bachelor's or Master's degree in Reliability Engineering, Physics, or Electrical, Mechanical, or Materials Engineering
  • 8+ years of Reliability Engineering work experience in a high-reliability industry
  • 3+ years of experience with failure analysis and root cause analysis
  • 3+ years of experience with accelerated life testing, stress analysis, and finite element analysis
  • Strong problem-analysis and problem-solving skills
  • Communication and vendor management skills
  • Ability to travel within the US and internationally

Benefits For Sr. Hardware Reliability Engineer, Infrastructure Reliability & Quality

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Work-life harmony
  • Career development resources
  • Knowledge-sharing opportunities
  • Mentorship programs
  • Inclusive culture
  • Employee-led affinity groups
