Senior Systems Engineer, Site Reliability Engineering

Google is a global technology leader that develops innovative products and services used by billions of people worldwide.
Site Reliability
Senior Software Engineer
In-Person
5+ years of experience
Enterprise SaaS

Description For Senior Systems Engineer, Site Reliability Engineering

Google's Site Reliability Engineering (SRE) team combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As a Senior Systems Engineer in SRE, you'll ensure that Google Cloud's services maintain reliability and appropriate uptime, and you'll monitor system capacity and performance. The role focuses on optimizing existing systems, building infrastructure, and automating work. You'll tackle scaling challenges unique to Google Cloud, applying expertise in coding, algorithms, and large-scale system design.

The team values diversity, intellectual curiosity, and problem-solving in a blame-free environment. Working in Technical Infrastructure, you'll help maintain Google's data centers and platforms and keep networks running optimally for the best user experience. The role offers room for self-direction on meaningful projects, along with support and mentorship for professional growth. You'll collaborate with colleagues from a range of backgrounds on the critical systems that power Google's extensive product portfolio.

Responsibilities For Senior Systems Engineer, Site Reliability Engineering

  • Improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement
  • Provide guidance to other team members on managing the availability and performance of mission-critical services
  • Maintain services by measuring and monitoring availability, latency, and overall system health
  • Lead sustainable incident response and blameless postmortems
  • Scale systems sustainably through automation
  • Support services before they go live through system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews

Requirements For Senior Systems Engineer, Site Reliability Engineering

  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 5 years of programming experience in one or more programming languages
  • 3 years of experience designing, analyzing, and troubleshooting distributed systems
  • Experience with Unix/Linux systems internals and administration
  • 2 years of experience leading projects
  • Experience in computing, distributed systems, storage, or networking
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug, optimize code, and automate routine tasks
  • Systematic problem-solving approach with effective communication skills
