Site Reliability Development at Google Cloud combines software and systems development to build and run large-scale, massively distributed, fault-tolerant systems. As a Site Reliability Developer, you'll keep Google Cloud's services reliable and available while focusing on system optimization, infrastructure development, and automation. The role presents challenges of scale unique to Google Cloud and requires expertise in coding, algorithms, complexity analysis, and large-scale system design.
The team embraces a culture of diversity, intellectual curiosity, and problem-solving in a blame-free environment. Google encourages collaboration, big thinking, and risk-taking while providing support and mentorship for professional growth. The organization brings together individuals with diverse backgrounds and perspectives, promoting self-direction on meaningful projects.
You'll be responsible for managing project priorities, deadlines, and deliverables while designing, developing, testing, deploying, maintaining, and enhancing software solutions. The role involves working with large-scale distributed systems, ensuring optimal capacity and performance of Google Cloud's services, and contributing to both internally critical and externally visible systems.
This position offers the opportunity to work with cutting-edge technology while being part of a team that values continuous learning and innovation. You'll collaborate with talented engineers, participate in design reviews, and contribute to the evolution of Google Cloud's infrastructure. The role pairs hands-on engineering with responsibility for system reliability, making it ideal for those passionate about both software development and systems engineering.