Site Reliability Engineering (SRE) at Google is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure Google's services have appropriate reliability and uptime while maintaining performance and capacity. The role focuses on optimizing existing systems, building infrastructure, and automating operations.
SRE at Google emphasizes limiting operational work, conducting blameless postmortems, and proactively identifying potential outages. The team culture values diversity, intellectual curiosity, and problem-solving in a blame-free environment. You'll work with various tools and approaches to solve a broad spectrum of problems.
The position offers opportunities to collaborate with people from diverse backgrounds and perspectives. Google encourages self-direction on meaningful projects while providing support and mentorship for growth. You'll be part of a team that builds creative engineering solutions to operations problems and maintains Google's internally critical and externally visible systems.
This role combines traditional software development with systems engineering, requiring both coding skills and systems knowledge. The position offers exposure to large-scale distributed systems and the chance to impact Google's global infrastructure.
As part of its commitment to diversity and inclusion, Google welcomes Indigenous applicants and has a vision of empowerment and equitable opportunity for all Aboriginal and Torres Strait Islander peoples.