Systems Engineer, Site Reliability Engineering

Google

Google is a global technology leader that develops innovative products and services used by billions of people worldwide.

Dublin, Ireland

Site Reliability

Mid-Level Software Engineer

Contact Company

2+ years of experience

Enterprise SaaS · Cloud

Description For Systems Engineer, Site Reliability Engineering

Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As an SRE, you'll ensure Google Cloud's services maintain reliability and appropriate uptime while monitoring system capacity and performance. The role involves optimizing existing systems, building infrastructure, and automating processes. You'll tackle unique scaling challenges specific to Google Cloud, applying expertise in coding, algorithms, and large-scale system design. The team values intellectual curiosity, problem-solving, and openness, bringing together diverse perspectives in a blame-free environment. You'll work on meaningful projects with support and mentorship for growth. Behind the scenes, you'll be part of the Technical Infrastructure team that maintains Google's data centers and platforms, keeping networks running optimally. This role offers opportunities to shape service lifecycle, provide technical guidance, and drive system improvements while working with cutting-edge distributed systems technology.

Last updated 4 days ago

Responsibilities For Systems Engineer, Site Reliability Engineering

Improve the whole lifecycle of services from inception and design, through deployment, operation, and refinement
Manage support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews
Provide guidance to other team members on managing availability and performance of mission services
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
Scale systems sustainably through mechanisms like automation and evolve systems by driving changes that improve reliability and velocity

Requirements For Systems Engineer, Site Reliability Engineering

Linux

Python

Java

Bachelor's degree in Computer Science, a related field, or equivalent practical experience
2 years of experience with programming in one or more programming languages
2 years of experience with Unix/Linux systems internals and administrations or networking
Experience with working in computing, distributed systems, storage, or networking
Experience in designing, analyzing, and troubleshooting distributed systems
Ability to debug, optimize code, and automate routine tasks
Excellent problem-solving and communication skills

Google

Google is a global technology leader that develops innovative products and services used by billions of people worldwide.

Dublin, Ireland

Site Reliability

Mid-Level Software Engineer

Contact Company

2+ years of experience

Enterprise SaaS · Cloud

Google

How would you sort an unsorted array of integers in ascending order using the merge sort algorithm, without using built-in sorting functions? Explain merge sort, its implementation, efficiency and how it can be modified for descending order or implemented iteratively.

Data Structures & AlgorithmsHard

Given an unsorted array of integers, write a function to sort the array in ascending order using the merge sort algorithm. Explanation of Merge Sort:** Briefly explain the merge sort algorithm, highlighting its divide-and-conquer approach. Implementation:** Provide a step-by-step implementation of the merge sort algorithm. This includes: A mergeSort function that recursively divides the array into smaller subarrays. A merge function that merges two sorted subarrays into a single sorted array. Example:** Input array: [38, 27, 43, 3, 9, 82, 10] Expected output: [3, 9, 10, 27, 38, 43, 82] Efficiency:** Discuss the time and space complexity of merge sort. Why is merge sort considered an efficient sorting algorithm? How does it compare to other sorting algorithms like bubble sort or quicksort in terms of performance (best, average, and worst-case scenarios)? Constraints:** You are not allowed to use built-in sorting functions. The array can contain positive and negative integers. The solution must be implemented in a language of your choice (e.g., Python, Java, C++). Follow-up:** How would you modify the merge sort algorithm to sort the array in descending order? Could you implement an iterative version of merge sort instead of the recursive one? What are the trade-offs?

Arrays

Recursion

Google

Design a rate limiter for an API.

System DesignMedium

Let's design a rate limiter for an API. The rate limiter should: Limit the number of requests a user can make to an API within a given time window. Support different rate limits for different users or API endpoints. For example, a free tier user might be limited to 10 requests per minute, while a premium user can make 100 requests per minute. Similarly, some API endpoints might have lower rate limits than others due to resource constraints. Be highly performant to avoid adding significant latency to API requests. Be scalable to handle a large number of users and API requests. Be reliable and prevent accidental overloads that could cause service disruptions. Describe the architecture, algorithms, and data structures you would use to implement such a rate limiter. Consider the following: How would you store the rate limit counters for each user or endpoint? What algorithm would you use to determine whether a request should be allowed or throttled? How would you handle distributed rate limiting across multiple servers? How would you ensure the rate limiter is resilient to failures? As an example, imagine a scenario where a social media platform wants to prevent users from spamming their API with excessive requests to post updates. The platform has millions of users and thousands of API endpoints. Some users pay for premium access with higher request limits. How would you design a robust and efficient rate limiter to handle this scale and these varying requirements?

Arrays

Strings

Two Pointers

Stacks

Binary Search

Sliding Windows

Linked Lists

Trees

Recursion

Graphs

Dynamic Programming

Greedy Algorithms

Bit Manipulation

Database Problems

Google

Where do you see yourself in 5 years?

Behavioral

Where do you see yourself in 5 years?

Interested in this job?

Jobs Related To Google Systems Engineer, Site Reliability Engineering

Software Developer III, Site Reliability Development, Google Cloud

Google

Site Reliability Developer role at Google Cloud focusing on building and maintaining large-scale distributed systems with competitive compensation and growth opportunities.

Technical Program Manager, Site Reliability Engineering

Google

Technical Program Manager position at Google's SRE team, leading infrastructure and service delivery projects with focus on operational excellence and cross-functional collaboration.

Program Manager, Platforms and Devices Site Reliability Engineering

Google

Lead complex technical programs for Google's Platforms and Devices SRE team, managing cross-functional projects and driving organizational efficiency.

Site Reliability Engineer

Google

Site Reliability Engineer position at Google Dublin, focusing on building and maintaining large-scale distributed systems with emphasis on reliability and automation.

Software Engineer III, Shopping Build Site Reliability Engineer

Google

Site Reliability Engineer role at Google focusing on building and maintaining large-scale distributed systems for Google Cloud services.