Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure Google's services maintain reliability and appropriate uptime while managing system capacity and performance. The role focuses on optimizing existing systems, building infrastructure, and automating processes.
The Technical Infrastructure team is responsible for the architecture that powers Google's product portfolio. From developing and maintaining data centers to building next-generation Google platforms, this team makes Google's services possible. This role focuses specifically on database reliability for Spanner, Google Cloud Platform's database service.
You'll be joining a culture that values diversity, intellectual curiosity, and problem-solving in a blame-free environment. The team brings together people with varied backgrounds and perspectives, encouraging collaboration and big-picture thinking. You'll have the opportunity to work on meaningful projects while receiving support and mentorship for professional growth.
Key aspects of the role include collaborating with Cloud Support, improving system reliability, participating in on-call rotations, and managing GCP Spanner allocations. The position requires strong programming and systems skills, combined with the ability to work effectively in a team. This is an excellent opportunity for someone passionate about large-scale systems and database reliability who wants to make an impact on billions of users worldwide.