Senior Systems Engineer, Site Reliability Engineering

Google is a global technology company that builds and maintains large-scale, distributed systems and infrastructure powering its product portfolio.
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS · AI

Description For Senior Systems Engineer, Site Reliability Engineering

Google's Site Reliability Engineering (SRE) team is seeking a Senior Systems Engineer to join its technical infrastructure organization. This role combines software and systems engineering to build and maintain Google Cloud's large-scale, distributed systems. As an SRE, you'll be responsible for ensuring the reliability and uptime of both internal and external systems while focusing on performance optimization and automation.

The position requires a strong background in distributed systems: at least 5 years of programming experience, 3 years of experience designing, analyzing, and troubleshooting distributed systems, and experience with system administration or networking. You'll lead projects, provide technical guidance to team members, and play a central role in incident response and system optimization.

The role offers the unique challenges of working at Google's scale, where you'll apply your expertise in coding, algorithms, and system design. You'll be part of a diverse and collaborative culture that encourages intellectual curiosity and problem-solving in a blame-free environment. The team promotes self-direction while providing support and mentorship for continuous learning and growth.

Key responsibilities include improving service lifecycles, maintaining system health through monitoring and metrics, leading incident response, and driving automation initiatives. You'll also contribute to system design consulting and capacity planning for new services.

This is an excellent opportunity for experienced engineers who want to work on some of the world's largest distributed systems, contribute to Google's technical infrastructure, and lead technical initiatives while working with a diverse and talented team. The role offers the chance to solve complex challenges at scale while helping to shape the future of Google's infrastructure.

Last updated 3 months ago

Responsibilities For Senior Systems Engineer, Site Reliability Engineering

  • Improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement
  • Provide guidance to other team members on managing the availability and performance of mission-critical services
  • Maintain services by measuring and monitoring availability, latency, and overall system health
  • Lead sustainable incident response and blameless postmortems
  • Scale systems sustainably through automation
  • Support services before they go live through system design consulting, capacity planning, and launch reviews

Requirements For Senior Systems Engineer, Site Reliability Engineering

Linux
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 5 years of experience with programming in one or more programming languages
  • 3 years of experience designing, analyzing, and troubleshooting distributed systems
  • Experience with system administration or networking (TCP/IP, routing, network topologies)
  • 2 years of experience leading projects and providing technical leadership
  • Experience working with incident response
