Software Development Engineer, AWS Incident Tooling & Response

World's most comprehensive and broadly adopted cloud platform, pioneering cloud computing and continuous innovation.
Backend
Mid-Level Software Engineer
In-Person
5,000+ Employees
3+ years of experience
Enterprise SaaS · Cloud

Description For Software Development Engineer, AWS Incident Tooling & Response

Amazon Web Services (AWS) is seeking a Software Development Engineer to join its Incident Response Systems team, a crucial component of AWS's infrastructure services. This role focuses on building and maintaining systems that automate fault containment, problem diagnosis, and issue resolution across AWS's distributed architectures. The position offers an opportunity to work with AWS's largest products and directly impact the stability and reliability of the world's leading cloud platform.

The role involves designing and implementing automated systems that analyze metrics and dependency data to determine root causes of issues without human intervention. You'll work closely with teams across AWS to drive adoption of the team's software and influence system development practices. Within your first year, you'll collaborate with senior technical leaders, implement new systems, and investigate historical customer-impacting events to prevent future occurrences.

AWS Infrastructure Services manages all AWS global infrastructure, ensuring continuous operation of data centers, servers, storage, networking, and related equipment. The team tackles complex supply chain challenges and maintains the highest standards for safety and security while optimizing capacity and cost efficiency for customers.

The position offers a diverse and inclusive work environment, with opportunities to participate in employee-led affinity groups and ongoing learning experiences. AWS values varied experiences and backgrounds, encouraging applications from candidates with non-traditional career paths. The team comprises engineers with extensive incident response experience and traditional software engineering backgrounds, creating a vibrant and creative environment focused on building high-quality solutions for customer problems.

This role represents an excellent opportunity for a software engineer passionate about large-scale systems, automation, and maintaining critical infrastructure. You'll be at the forefront of ensuring AWS remains the most reliable cloud platform while working with cutting-edge technologies and influential technical leaders.

Responsibilities For Software Development Engineer, AWS Incident Tooling & Response

  • Write well-tested, maintainable code
  • Design, contribute to, and maintain systems that solve customer problems
  • Work with teammates to improve code quality, system architecture, and team processes
  • Learn about incident management processes
  • Review code and create documentation
  • Respond to operational issues in the team's systems
  • Contribute to the long-term direction for the team

Requirements For Software Development Engineer, AWS Incident Tooling & Response

Java · Python · JavaScript
  • Experience (non-internship) in professional software development
  • Experience designing or architecting new and existing systems
  • Experience programming in at least one programming language
  • Bachelor's degree in computer science or equivalent (preferred)
  • Experience with full software development life cycle (preferred)

Benefits For Software Development Engineer, AWS Incident Tooling & Response

Medical Insurance · Parental Leave · Education Budget
  • Work-life harmony
  • Mentorship and career growth opportunities
  • Inclusive team culture
  • Employee-led affinity groups
  • Knowledge-sharing resources
  • Career advancement resources
