Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure Google Cloud's services maintain reliability and appropriate uptime while focusing on continuous improvement. The role involves optimizing existing systems, building infrastructure, and implementing automation solutions. You'll handle the challenges of scale that are unique to Google Cloud while applying expertise in coding, algorithms, analysis, and large-scale system design.
The SRE team values diversity, intellectual curiosity, and problem-solving in a blame-free environment. You'll join a collaborative culture that brings together people with varied backgrounds and perspectives. The role offers self-direction on meaningful projects, with support and mentorship for professional growth.
As a Software Engineer III in SRE, you'll manage project priorities, deadlines, and deliverables while designing, developing, testing, deploying, maintaining, and enhancing software solutions. You'll work with cutting-edge distributed systems, contribute to system reliability, and help shape the future of Google Cloud's infrastructure.
The position requires strong technical skills, particularly in distributed systems and software development. You'll collaborate with talented engineers, participate in design reviews, and have the opportunity to make a significant impact on Google's infrastructure. The role offers exposure to some of the world's largest computing systems while working in a supportive, growth-oriented environment.