Google's Site Reliability Engineering (SRE) team is seeking a Senior Software Engineer to join its Data Cloud division. This role combines software and systems engineering to build and maintain the large-scale distributed systems that power Google Cloud's services. The position offers the opportunity to work on complex scalability challenges while applying AI and machine learning.
The role involves leading the development of core infrastructure and tools that enable SRE teams to use AI to gain insight into system behavior. You'll design and implement AI features that improve engineering efficiency and customer satisfaction, such as incident-support case matching, similarity search, and bug analysis.
As part of Google's Technical Infrastructure team, you'll work behind the scenes to maintain and develop data centers and next-generation Google platforms. The team takes pride in being "engineers' engineers" and focuses on keeping networks running optimally for the best user experience.
The position offers exposure to cutting-edge technology and the chance to work with diverse, intellectually curious professionals in a blame-free environment that encourages collaboration and risk-taking. Google promotes self-direction on meaningful projects while providing support and mentorship for continuous learning and growth.
This role suits candidates with strong software development experience and expertise in distributed systems who want to have an impact on global-scale infrastructure. You'll join a culture that values diversity, problem-solving, and openness, working on projects that directly influence the reliability and performance of Google's product portfolio.
The ideal candidate will have experience in software engineering and machine learning, with a proven track record in designing and troubleshooting large-scale distributed systems. Strong communication skills and a systematic approach to problem-solving are essential, as you'll collaborate with teams across Google to ensure service reliability and continuous improvement.