Software Engineering Manager II, Site Reliability Engineering

Google is a global technology company that builds and maintains large-scale distributed systems and infrastructure powering its product portfolio.
Site Reliability
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
Enterprise SaaS

Description For Software Engineering Manager II, Site Reliability Engineering

Google's Site Reliability Engineering (SRE) team is seeking a Software Engineering Manager II to lead and manage a team of engineers responsible for building and maintaining large-scale, distributed systems. This role combines software and systems engineering to ensure Google's services maintain optimal reliability and performance.

As an Engineering Manager in the SRE organization, you'll manage complex challenges unique to Google's scale. You'll lead a team responsible for the uptime and performance of critical systems, while focusing on automation, optimization, and infrastructure development. The role requires a strong technical background: 8 years of experience with data structures or algorithms, 5 years of software development experience, and 3 years of people management experience.

The position offers the opportunity to work with Google's Technical Infrastructure team, where you'll be responsible for developing and maintaining data centers and building next-generation Google platforms. You'll lead projects that directly impact the user experience of Google's services worldwide, managing on-call rotations across continents and implementing automation to prevent problem recurrence.

SRE at Google promotes a culture of diversity, intellectual curiosity, and problem-solving in a blame-free environment. The team brings together individuals with varied backgrounds and perspectives, encouraging collaboration and innovation. You'll have the chance to work on meaningful projects while receiving support and mentorship for continuous learning and growth.

The ideal candidate will combine technical expertise in distributed systems with strong leadership abilities. You'll need to demonstrate expertise in designing and troubleshooting large-scale systems, possess excellent problem-solving skills, and have the ability to communicate effectively. This role offers the unique opportunity to be at the intersection of software development and systems engineering, making a significant impact on Google's global infrastructure.

Last updated 3 days ago

Responsibilities For Software Engineering Manager II, Site Reliability Engineering

  • Lead a team of Software/Systems Engineers on projects for users and be directly responsible for uptime
  • Own end-to-end availability and performance of key services and build automation to prevent problem recurrence
  • Automate response to all non-exceptional service conditions
  • Lead by example, mentor the team and establish credibility through quality technical execution
  • Manage on-call rotations across continents, using a follow-the-sun model
  • Design, write and deliver software to improve the availability, scalability, latency and efficiency of Google's services

Requirements For Software Engineering Manager II, Site Reliability Engineering

  • Skills: Linux, Kubernetes
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 8 years of experience with data structures or algorithms
  • 5 years of experience with software development in one or more programming languages
  • 3 years of people management experience
  • Experience designing, analyzing, and troubleshooting distributed systems
  • Experience working in computing, distributed systems, storage, or networking
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug and optimize code and to automate routine tasks
  • Systematic problem-solving approach
  • Effective verbal and written communication skills

Benefits For Software Engineering Manager II, Site Reliability Engineering

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Equal opportunity employer
  • Accommodation for special needs

