Join Amazon's Infrastructure Reliability Engineering team, where we build scalable solutions that ensure the reliability of Amazon's critical systems. Our team develops and operates tools for detecting and preventing outages across global infrastructure, focusing on service-to-service communications, network traffic, and event correlation. We're seeking talented Software Development Engineers to innovate in these areas, including distributed tracing and network analysis, at Amazon scale.
The role offers the opportunity to work on greenfield programs while collaborating with engineers across Amazon. We foster a culture of continuous learning and growth, empowering team members to expand their skills. You'll work with core technologies including Java, Python, Linux, and AWS services to create maintainable, high-quality software backed by robust automated testing.
As part of the Ops Tech Solutions team, you'll be responsible for building intelligent, real-time insights into service behavior across hundreds of Amazon's critical fulfillment and robotics services. Your work will directly impact millions of customers by ensuring high availability and maintaining Amazon's Customer Promise.
The position offers comprehensive benefits including medical, dental, and vision coverage, parental leave options, PTO, and a 401(k) plan. You'll be joining a dynamic team that values innovation, problem-solving, and high standards for software quality and reliability. This is an excellent opportunity for those passionate about developing robust, highly available systems at tremendous scale.