Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an Engineering Manager on the SRE team, you'll lead a team responsible for keeping Google's services reliable and maintaining appropriate uptime, while handling the complex challenges unique to Google's scale. The role involves optimizing existing systems, building infrastructure, and automating processes.
The position requires strong technical leadership skills: you'll manage and mentor a team of Software/Systems Engineers and be directly responsible for service uptime and performance. You'll work on critical infrastructure that powers Google's vast product portfolio, from data centers to next-generation platforms.
The role offers the opportunity to work in a culture that values diversity, intellectual curiosity, and problem-solving. You'll be part of an organization that brings together people with varied backgrounds and perspectives, encouraging collaboration and innovation in a blame-free environment. The position provides both the autonomy to work on meaningful projects and the support structure needed for professional growth.
Key aspects include managing on-call rotations across continents, building automation to prevent problems from recurring, and leading technical projects that improve service availability, scalability, and efficiency. The role combines technical expertise with people management, so it calls for both strong engineering skills and leadership ability.
This is an excellent opportunity for experienced engineers looking to step into a leadership role while remaining technically hands-on, working with cutting-edge technology at massive scale.