Google's Site Reliability Engineering (SRE) team is seeking a Senior Systems Engineer to join its technical infrastructure team. This role combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll be responsible for ensuring that Google Cloud's services remain reliable and maintain appropriate uptime, and for monitoring system capacity and performance.
The position requires extensive experience in distributed systems, with a focus on designing, analyzing, and troubleshooting large-scale infrastructure. You'll work on optimizing existing systems, building infrastructure, and creating automation solutions to eliminate manual work. The role offers unique challenges of scale specific to Google Cloud, where you'll apply your expertise in coding, algorithms, complexity analysis, and large-scale system design.
SRE's culture emphasizes diversity, intellectual curiosity, and problem-solving in a blame-free environment. The team brings together individuals with varied backgrounds and perspectives, encouraging collaboration and innovative thinking. You'll have the opportunity to work on meaningful projects with significant impact, while receiving support and mentorship for continuous learning and growth.
Key responsibilities include improving service lifecycles, providing technical guidance to team members, maintaining service reliability through monitoring and metrics, leading incident response, and driving automation initiatives. You'll also be involved in system design consulting, capacity planning, and launch reviews for new services.
The ideal candidate will have at least 5 years of programming experience, strong knowledge of Unix/Linux systems, and proven experience with distributed systems. Leadership experience and excellent communication skills are essential, as you'll be guiding team members and collaborating across various technical teams.
Join Google's Technical Infrastructure team to help build and maintain the architecture that powers Google's vast product portfolio. You'll be part of a team that takes pride in being the engineers' engineers, focusing on creating robust, scalable solutions that ensure the best possible user experience.