Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As a System Engineering Manager for Google Play's SRE team, you'll lead a team that keeps Google's services reliable, maintains appropriate uptime, and monitors system capacity and performance. The role involves complex challenges unique to Google's scale and draws on expertise in coding, algorithms, and large-scale system design.
The position requires strong leadership skills to guide a team of experienced engineers whose work focuses on optimizing existing systems, building infrastructure, and implementing automation. You'll be responsible for maintaining service availability, managing cross-continental on-call rotations, and driving technical projects to completion. The role combines technical expertise with people management and requires both strategic thinking and hands-on technical guidance.
Google's SRE culture emphasizes diversity, intellectual curiosity, and problem-solving in a blame-free environment. You'll work with people from a variety of backgrounds and perspectives in a setting that encourages collaboration and innovation. The Technical Infrastructure team plays a crucial role in maintaining Google's architecture, from data centers to next-generation platforms, so that users have the best possible experience.
This is an opportunity to lead and grow a team while working on some of the most complex and impactful systems in technology. You'll be responsible for both the technical success of critical infrastructure and the professional development of your team members, so the role suits those who combine technical excellence with strong leadership capabilities.