Software Engineering Manager II, Site Reliability Engineering

Global technology leader specializing in internet-related services and products.
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
Enterprise SaaS

Description For Software Engineering Manager II, Site Reliability Engineering

Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an Engineering Manager on the SRE team, you'll lead a team responsible for keeping Google's services reliable and maintaining appropriate uptime while optimizing performance and capacity. The role involves tackling complex challenges unique to Google's scale, drawing on expertise in coding, algorithms, and large-scale system design.

The position is part of Google's Technical Infrastructure team, which is fundamental to keeping Google's product portfolio running. The team develops and maintains data centers, builds next-generation Google platforms, and ensures networks operate at peak performance. The culture emphasizes diversity, intellectual curiosity, and problem-solving in a blame-free environment.

You'll lead a team of Software/Systems Engineers, provide technical leadership on key projects, and be directly responsible for service uptime. The role involves end-to-end ownership of service availability and performance, building automation to prevent problems, and managing on-call rotations across continents. This position offers the opportunity to work on meaningful projects while receiving support and mentorship for continuous learning and growth.

Google promotes a diverse and inclusive workplace, offering equal employment opportunities regardless of background. The role requires strong technical expertise combined with leadership skills, making it ideal for experienced engineers who want to impact Google's infrastructure at a global scale.

Last updated 4 days ago

Responsibilities For Software Engineering Manager II, Site Reliability Engineering

  • Lead a team of Software/Systems Engineers on projects for users and be directly responsible for uptime
  • Own end-to-end availability and performance of key services and build automation to prevent problem recurrence
  • Automate response to all non-exceptional service conditions
  • Lead by example, mentor the team and establish credibility through quality technical execution
  • Manage on-call rotations across continents, using a follow-the-sun model
  • Design, write and deliver software to improve the availability, scalability, latency and efficiency of Google's services

Requirements For Software Engineering Manager II, Site Reliability Engineering

Linux
Kubernetes
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 8 years of experience with data structures or algorithms
  • 5 years of experience with software development in one or more programming languages
  • 3 years of people management experience
  • Experience designing, analyzing, and troubleshooting distributed systems
  • Experience working in computing, distributed systems, storage, or networking
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug and optimize code and to automate routine tasks
  • Systematic problem-solving approach, coupled with effective communication skills
