Systems Engineer, Site Reliability Engineering

Google is a global technology leader that develops innovative products and services used by billions of people worldwide.
Site Reliability
Mid-Level Software Engineer
2+ years of experience
Enterprise SaaS · Cloud

Description For Systems Engineer, Site Reliability Engineering

Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure that Google Cloud's services maintain reliability and appropriate uptime while monitoring system capacity and performance. The role involves optimizing existing systems, building infrastructure, and automating processes. You'll tackle scaling challenges specific to Google Cloud, applying expertise in coding, algorithms, and large-scale system design. The team values intellectual curiosity, problem-solving, and openness, and brings together diverse perspectives in a blame-free environment. You'll work on meaningful projects with support and mentorship for growth. Behind the scenes, you'll be part of the Technical Infrastructure team that maintains Google's data centers and platforms and keeps its networks running optimally. This role offers opportunities to shape the service lifecycle, provide technical guidance, and drive system improvements while working with distributed systems technology.

Last updated 4 days ago

Responsibilities For Systems Engineer, Site Reliability Engineering

  • Improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews
  • Provide guidance to other team members on managing the availability and performance of mission-critical services
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
  • Scale systems sustainably through mechanisms like automation and evolve systems by driving changes that improve reliability and velocity

Requirements For Systems Engineer, Site Reliability Engineering

Linux
Python
Go
Java
  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 2 years of experience with programming in one or more programming languages
  • 2 years of experience with Unix/Linux systems internals and administration, or with networking
  • Experience working in computing, distributed systems, storage, or networking
  • Experience in designing, analyzing, and troubleshooting distributed systems
  • Ability to debug, optimize code, and automate routine tasks
  • Excellent problem-solving and communication skills

