Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an Engineering Manager on the SRE team, you'll lead a team responsible for keeping Google's services reliable, with appropriate uptime, while optimizing performance and capacity. The role involves tackling complex challenges unique to Google's scale, drawing on expertise in coding, algorithms, and large-scale system design.
The position is part of Google's Technical Infrastructure team, which is fundamental to keeping Google's product portfolio running. The team develops and maintains data centers, builds next-generation Google platforms, and ensures networks operate at peak performance. The culture emphasizes diversity, intellectual curiosity, and problem-solving in a blame-free environment.
You'll lead a team of Software/Systems Engineers, provide technical leadership on key projects, and be directly responsible for service uptime. The role involves end-to-end ownership of service availability and performance, building automation to prevent problems, and managing on-call rotations across continents. The position offers the opportunity to work on meaningful projects while receiving support and mentorship for continuous learning and growth.
Google promotes a diverse and inclusive workplace and offers equal employment opportunities regardless of background. The role requires strong technical expertise combined with leadership skills, which makes it well suited to experienced engineers who want to have an impact on Google's infrastructure at a global scale.