Software Engineer III, Site Reliability Engineering

Google

Google is a global technology company that builds and runs large-scale, massively distributed systems.

Dublin, Ireland

$120,000 - $200,000

Site Reliability

Mid-Level Software Engineer

In-Person

5,000+ Employees

2+ years of experience

Enterprise SaaS · Cloud

Description For Software Engineer III, Site Reliability Engineering

Site Reliability Engineering (SRE) at Google combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As a Software Engineer III in SRE, you'll be responsible for ensuring that Google Cloud's services stay reliable and meet the uptime customers need, while driving continuous improvement. The role involves managing large-scale systems unique to Google Cloud, with a focus on optimizing existing systems, building infrastructure, and implementing automation.

The position requires strong coding skills, understanding of algorithms, and expertise in large-scale system design. You'll be working in a diverse and collaborative environment that values intellectual curiosity and problem-solving. Google's SRE culture promotes self-direction and risk-taking in a blame-free environment, while providing necessary support and mentorship for professional growth.

Key aspects of the role include managing project priorities, deadlines, and deliverables, as well as designing, developing, testing, deploying, maintaining, and enhancing software solutions. You'll be part of a team that maintains an ever-watchful eye on systems capacity and performance, working with both internally critical and externally-visible systems.

The role offers unique opportunities to work with Google's massive infrastructure, collaborate with diverse teams, and contribute to the reliability of services used by millions. You'll be expected to participate in code reviews, contribute to documentation, and work on solving complex distributed systems challenges. The position combines technical expertise with system reliability, making it ideal for engineers passionate about both software development and operations at scale.

Last updated 2 months ago

Responsibilities For Software Engineer III, Site Reliability Engineering

Write product or system development code
Review code developed by other engineers and provide feedback
Contribute to existing documentation or educational content
Triage product or system issues and debug/track/resolve issues
Participate in, or lead design reviews with peers and stakeholders

Requirements For Software Engineer III, Site Reliability Engineering

Python

Java

Linux

Kubernetes

Bachelor's degree in Computer Science, a related field, or equivalent practical experience
2 years of experience with data structures/algorithms and software development
Experience in one or more programming languages
Master's degree in Computer Science or Engineering (preferred)
2 years of experience in designing, analyzing, and troubleshooting distributed systems (preferred)

Benefits For Software Engineer III, Site Reliability Engineering

Medical Insurance

Dental Insurance

Vision Insurance

Equal opportunity employer
Accommodation for special needs

Google

How would you implement a placement system for cars in a racing game with a circular track?

Data Structures & Algorithms · Hard

Imagine you are developing a racing game. You need to implement a system that determines the placement of each car in the race at any given moment. The race track is a circular track of a fixed length. Each car has a unique ID and a current distance traveled from the starting line. Design a function that takes a list of cars, where each car is represented by its ID and distance traveled, and returns the current placement of each car, considering the circular nature of the track. For example, if the track length is 1000 meters and a car has traveled 1200 meters, its actual position on the track is 200 meters. Consider edge cases such as multiple cars having the same position. Can you implement a function that efficiently calculates and assigns placements to the cars based on their positions on the track?
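One way to approach the question above is sketched below. It is a minimal illustration under stated assumptions, not a reference answer: the function name `assign_placements`, the `(car_id, distance)` tuple representation, and the tie rule (cars at the same track position share a placement, and the next placement is skipped) are all choices made for this sketch. It ranks cars by their wrapped position on the track, as the prompt describes; ranking by total distance traveled would be a reasonable alternative if laps should count.

```python
from typing import Dict, List, Tuple


def assign_placements(
    cars: List[Tuple[str, float]], track_length: float
) -> Dict[str, int]:
    """Return a placement (1 = furthest along the track) for each car ID.

    Assumption: placement is based on position on the circular track,
    i.e. distance modulo track_length, not on total distance traveled.
    """
    if track_length <= 0:
        raise ValueError("track_length must be positive")

    # Wrap each distance onto the circular track.
    positions = [(car_id, distance % track_length) for car_id, distance in cars]

    # Sort by wrapped position, furthest first: O(n log n).
    positions.sort(key=lambda item: item[1], reverse=True)

    placements: Dict[str, int] = {}
    previous_position = None
    current_place = 0
    for index, (car_id, position) in enumerate(positions, start=1):
        # Cars at the same position share a placement (1, 2, 2, 4, ...).
        if position != previous_position:
            current_place = index
            previous_position = position
        placements[car_id] = current_place
    return placements


if __name__ == "__main__":
    cars = [("a", 1200), ("b", 900), ("c", 200), ("d", 50)]
    print(assign_placements(cars, 1000))
```

With a 1000 meter track, car "a" at 1200 meters wraps to 200 meters and ties with car "c", so both receive the same placement. The sort dominates the cost, giving O(n log n) time for n cars.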

Arrays

Greedy Algorithms

Google

Design a URL shortening system.

System Design · Medium

Let's design a system for URL shortening, like TinyURL. Assume that we need to handle a large number of requests, say billions of URLs per day. Consider the following:

Functional Requirements:
The system should generate a shorter, unique alias for a given URL.
Users should be able to enter a shortened URL and be redirected to the original URL.
The shortened URLs should be relatively short.

Non-Functional Requirements:
The system should be highly available.
URL redirection should be as fast as possible.
The shortened URLs should be unique.
The system should be scalable to handle a large number of URLs and requests.

Considerations:
How would you design the data model to store the mappings between shortened and original URLs?
What algorithms would you use to generate the shortened URLs? Consider the trade-offs between different algorithms.
How would you handle collisions (when the same shortened URL is generated for two different original URLs)?
What kind of database would you use and why?
How would you handle the high volume of traffic and ensure low latency for redirection? Consider caching strategies.
How would you scale the system to handle future growth?

Walk me through your design, explaining your choices and the rationale behind them. Include diagrams and specific technologies where appropriate. Explain how you would ensure the system meets the requirements for availability, performance, and scalability.
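As one possible answer to the alias-generation question above, the sketch below encodes a monotonically increasing numeric ID in base 62. It is an illustration under stated assumptions, not a prescribed design: the names `encode_base62`, `decode_base62`, and `InMemoryShortener` are hypothetical, and the in-process counter and dictionary stand in for whatever distributed ID generator and database a real design would choose. Because every ID is unique, this approach avoids collisions by construction, at the cost of producing guessable, sequential aliases; hashing the URL is the usual alternative and needs explicit collision handling.

```python
import string
from typing import Dict, Optional

# 62 symbols: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(number: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_base62(code: str) -> int:
    """Decode a base-62 string back into its integer ID."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number


class InMemoryShortener:
    """Toy shortener: a counter plus a dict (assumed stand-ins for real storage)."""

    def __init__(self) -> None:
        self._next_id = 1
        self._url_by_code: Dict[str, str] = {}

    def shorten(self, url: str) -> str:
        # Each new URL gets the next ID, so no two URLs share a code.
        code = encode_base62(self._next_id)
        self._url_by_code[code] = url
        self._next_id += 1
        return code

    def resolve(self, code: str) -> Optional[str]:
        # Returns None for unknown codes.
        return self._url_by_code.get(code)


if __name__ == "__main__":
    shortener = InMemoryShortener()
    code = shortener.shorten("https://example.com/some/long/path")
    print(code, shortener.resolve(code))
```

Seven base-62 characters cover 62^7 (roughly 3.5 trillion) distinct IDs, which gives a sense of how alias length relates to the volume of URLs the system must hold.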

Database Problems

Arrays

Strings

Google

Describe a time you faced a tough technical problem. What made it difficult, and how did you solve it?

Behavioral

Describe a time you faced a tough technical problem. What made it so difficult, and how did you approach solving it? What resources did you consult, and what was the final outcome? What did you learn from this experience, and how has it shaped your problem-solving skills since then? For example, perhaps you were tasked with optimizing a slow-running query in a large database. The initial attempts to improve performance, such as adding indexes, didn't yield the desired results. You then had to dive deep into the query execution plan, identify bottlenecks, and rewrite the query using more efficient techniques, like leveraging window functions or temporary tables. You could also describe the challenges faced while integrating a new feature into a legacy system with poorly documented code. You might have had to reverse engineer parts of the system, deal with unexpected dependencies, and carefully test the integration to avoid breaking existing functionality. Elaborate on the specific tools and methodologies you employed during these challenges, such as debuggers, profilers, or version control systems. Also, explain how you collaborated with team members or sought assistance from external resources like online forums or documentation to overcome the obstacles.

Arrays

Strings

Two Pointers

Stacks

Binary Search

Sliding Windows

Linked Lists

Trees

Recursion

Graphs

Dynamic Programming

Greedy Algorithms

Bit Manipulation

Database Problems

Jobs Related To Google Software Engineer III, Site Reliability Engineering

Software Developer III, Site Reliability Development, Google Cloud

Google

Site Reliability Developer role at Google Cloud focusing on building and maintaining large-scale distributed systems with competitive compensation and growth opportunities.

Technical Program Manager, Site Reliability Engineering

Google

Technical Program Manager position at Google's SRE team, leading infrastructure and service delivery projects with focus on operational excellence and cross-functional collaboration.

Program Manager, Platforms and Devices Site Reliability Engineering

Google

Lead complex technical programs for Google's Platforms and Devices SRE team, managing cross-functional projects and driving organizational efficiency.

Site Reliability Engineer

Google

Site Reliability Engineer position at Google Dublin, focusing on building and maintaining large-scale distributed systems with emphasis on reliability and automation.

Software Engineer III, Shopping Build Site Reliability Engineer

Google

Site Reliability Engineer role at Google focusing on building and maintaining large-scale distributed systems for Google Cloud services.