Google's Site Reliability Engineering (SRE) team is seeking a Senior Systems Engineer to join its technical infrastructure organization. The role combines software and systems engineering to build and maintain Google Cloud's large-scale distributed systems. As an SRE, you'll be responsible for the reliability and uptime of both internal and external systems, with a focus on performance optimization and automation.
The position requires a strong background in distributed systems, with at least 5 years of programming experience and 3 years of systems/networking expertise. You'll lead projects, provide technical guidance to team members, and play a crucial role in incident response and system optimization.
The role offers the distinctive challenges of working at Google's scale, where you'll apply your expertise in coding, algorithms, and system design. You'll be part of a diverse and collaborative culture that encourages intellectual curiosity and problem-solving in a blame-free environment. The team promotes self-direction while providing support and mentorship for continuous learning and growth.
Key responsibilities include improving service lifecycles, maintaining system health through monitoring and metrics, leading incident response, and driving automation initiatives. You'll also contribute to system design consulting and capacity planning for new services.
This is an excellent opportunity for experienced engineers who want to work on some of the world's largest distributed systems, lead technical initiatives, and help shape the future of Google's infrastructure.