Google Cloud is seeking a Software Engineer III for Site Reliability Engineering (SRE) to join its team. This role combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure that Google Cloud's services, both internally critical and externally visible systems, have reliability and uptime appropriate to customer needs, while maintaining a fast rate of improvement.
Key responsibilities include:
The ideal candidate will have a Bachelor's degree in Computer Science or a related field (or equivalent practical experience) and at least 2 years of experience with data structures and algorithms and with software development. Preferred qualifications include experience in distributed systems, storage, or networking; expertise in designing and troubleshooting large-scale systems; and strong problem-solving and communication skills.
This role offers the opportunity to work on unique challenges of scale within Google Cloud, using your expertise in coding, algorithms, complexity analysis, and large-scale system design. You'll be part of a diverse and intellectually curious team that values problem-solving and openness. The SRE culture promotes self-direction, collaboration, and risk-taking in a blame-free environment, while also providing support and mentorship for continuous learning and growth.
Join Google Cloud's SRE team to manage complex challenges, optimize existing systems, build infrastructure, and automate processes, all while working in a supportive and innovative environment at the forefront of cloud technology.