Site Reliability Engineering (SRE) at Google Cloud combines software and systems engineering to build and maintain large-scale distributed systems. As an SRE, you'll be responsible for ensuring the reliability and uptime of Google Cloud's services, both internal and customer-facing. The role involves challenges of scale unique to Google Cloud and requires expertise in coding, algorithms, complexity analysis, and large-scale system design.
The position offers opportunities to work on meaningful projects in a blame-free environment that values diversity, intellectual curiosity, and problem-solving. You'll be part of the Technical Infrastructure team, responsible for developing and maintaining data centers and building next-generation Google platforms. The team takes pride in being the engineers' engineers, focusing on keeping networks running optimally for the best user experience.
The role combines system design, software development, and operational excellence. You'll be involved in the entire service lifecycle, from initial design to deployment and refinement. Key responsibilities include capacity planning, launch reviews, monitoring system health, and implementing automation for sustainable scaling. The team culture promotes self-direction while providing support and mentorship for continuous learning and growth.
Working at Google offers competitive compensation, including base salary, bonus, equity, and comprehensive benefits. The company is committed to building a representative workforce and fostering a culture of belonging, providing equal employment opportunities regardless of background. Join a team that values innovation, technical excellence, and collaborative problem-solving in maintaining Google's world-class infrastructure.