Hardware Reliability Engineer, Infrastructure Reliability & Quality

AWS is the world's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuously innovating.
DevOps
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS · Cloud

Description For Hardware Reliability Engineer, Infrastructure Reliability & Quality

AWS Infrastructure Services is seeking a Hardware Reliability Engineer to join the team responsible for keeping the cloud running smoothly. This role combines technical expertise with business acumen, focusing on maintaining and improving the reliability of AWS's global infrastructure.

As a Hardware Reliability Engineer, you'll be at the forefront of ensuring that AWS datacenter infrastructure and security equipment operate at peak efficiency. You'll work with cutting-edge technology, analyzing and mitigating reliability risks for critical systems including cameras, media destruction devices, access control systems, and various power and cooling equipment.

The role requires a unique blend of technical knowledge and analytical skills. You'll use physics-based approaches to evaluate product reliability, conduct lifecycle environmental assessments, and develop system-level reliability models. Your work will directly impact AWS's ability to provide continuous, reliable service to customers worldwide.

You'll join a diverse team of professionals, including software engineers, hardware specialists, and security experts. The collaborative environment encourages knowledge sharing and professional growth, with access to mentorship and career development resources. AWS values work-life harmony and maintains an inclusive culture that welcomes diverse perspectives and bold ideas.

Key responsibilities include driving reliability risk identification, performing root cause analysis of critical failures, and working with both internal teams and external vendors to implement improvements. You'll need strong analytical skills, excellent communication abilities, and a proven track record in reliability engineering.

This is an excellent opportunity for someone who wants to impact cloud computing infrastructure at a global scale. You'll be part of AWS's mission to deliver the highest standards of safety and security while providing seemingly infinite capacity at the lowest possible cost for customers.

The ideal candidate will have at least 5 years of relevant experience and a strong educational background in reliability engineering or related fields. You'll need to be comfortable with both technical analysis and business negotiations, as you'll interface with various stakeholders to drive continuous improvement in datacenter availability.

Last updated 3 months ago

Responsibilities For Hardware Reliability Engineer, Infrastructure Reliability & Quality

  • Drive reliability risk identification, assessment and mitigation for datacenter infrastructure & security equipment
  • Perform root cause analysis of critical equipment failures
  • Drive continuous improvement in datacenter availability & security
  • Work with internal and external partners, including suppliers
  • Develop a datacenter system-level reliability model
  • Monitor product performance and drive corrective actions
  • Conduct vendor audits and quarterly reviews
  • Drive AWS application-specific requirements in lifecycle environmental and operational stress analysis

Requirements For Hardware Reliability Engineer, Infrastructure Reliability & Quality

  • Bachelor's or Master's degree in Reliability Engineering, Physics, Electrical, Mechanical, or Materials Engineering, or a related field
  • 5+ years of Reliability Engineering work experience in a high-reliability industry
  • 5+ years of experience with failure analysis activities and root cause analysis
  • 5+ years of experience with accelerated life testing, stress analysis and finite element analysis
  • Knowledge of statistical techniques and models
  • Ability to travel within the US and internationally

Benefits For Hardware Reliability Engineer, Infrastructure Reliability & Quality

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Work-life harmony
  • Career development opportunities
  • Mentorship programs
  • Inclusive culture
  • Employee-led affinity groups
