Senior Software Developer, Site Reliability Engineering, Google Cloud

Google

Google is a global technology company that builds and maintains the technical infrastructure powering its product portfolio through data centers and platforms.

Waterloo, ON, Canada

Site Reliability

Senior Software Engineer

In-Person

5,000+ Employees

5+ years of experience

Enterprise SaaS · Cloud

Description For Senior Software Developer, Site Reliability Engineering, Google Cloud

Site Reliability Development at Google combines software and systems development to build and run large-scale, massively distributed, fault-tolerant systems. The role focuses on ensuring Google's services maintain reliability and appropriate uptime while monitoring system capacity and performance. As an SRE, you'll work on optimizing existing systems, building infrastructure, and automating processes.

The position offers unique challenges of scale specific to Google, requiring expertise in coding, algorithms, complexity analysis, and large-scale system design. Google's SRE culture promotes intellectual curiosity, problem-solving, and openness, bringing together diverse perspectives in a blame-free environment. The team encourages self-direction on meaningful projects while providing support and mentorship for growth.

The Technical Infrastructure team, which includes SRE, is fundamental to Google's operations, developing and maintaining data centers and building next-generation platforms. The role involves working with cutting-edge technology and ensuring users have the best possible experience. This position offers the opportunity to work with complex systems at scale, collaborate with talented engineers, and directly impact Google's global infrastructure.

The ideal candidate will combine technical expertise with leadership capabilities, working across the entire service lifecycle from design to deployment and optimization. This role is perfect for engineers who enjoy solving complex distributed systems challenges, are passionate about automation and system reliability, and want to work at the forefront of large-scale technical infrastructure.

Last updated 9 hours ago

Responsibilities For Senior Software Developer, Site Reliability Engineering, Google Cloud

Engage in and improve the whole lifecycle of services—from inception and design, through to deployment, operation and refinement
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
Maintain services once they are live by measuring and monitoring availability, latency and overall system health
Scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
Practice sustainable incident response and blameless postmortems

Requirements For Senior Software Developer, Site Reliability Engineering, Google Cloud

Linux

Python

Bachelor's degree in Computer Science, a related field, or equivalent practical experience
5 years of experience with software development in one or more programming languages
5 years of experience with data structures or algorithms
3 years of experience in designing, analyzing, and troubleshooting large-scale distributed systems
2 years of experience leading projects and providing technical leadership
Master's degree in Computer Science or Engineering (preferred)

Benefits For Senior Software Developer, Site Reliability Engineering, Google Cloud

Medical Insurance

Vision Insurance

Dental Insurance

Parental Leave

Equal opportunity employer
Accommodation for special needs
Global work environment

Google

How would you sort an unsorted array of integers in ascending order using the merge sort algorithm, without using built-in sorting functions? Explain merge sort, its implementation, efficiency and how it can be modified for descending order or implemented iteratively.

Data Structures & Algorithms · Hard

Given an unsorted array of integers, write a function to sort the array in ascending order using the merge sort algorithm.

Explanation of Merge Sort: Briefly explain the merge sort algorithm, highlighting its divide-and-conquer approach.

Implementation: Provide a step-by-step implementation of the merge sort algorithm. This includes:
A mergeSort function that recursively divides the array into smaller subarrays.
A merge function that merges two sorted subarrays into a single sorted array.

Example:
Input array: [38, 27, 43, 3, 9, 82, 10]
Expected output: [3, 9, 10, 27, 38, 43, 82]

Efficiency: Discuss the time and space complexity of merge sort.
Why is merge sort considered an efficient sorting algorithm?
How does it compare to other sorting algorithms like bubble sort or quicksort in terms of performance (best, average, and worst-case scenarios)?

Constraints:
You are not allowed to use built-in sorting functions.
The array can contain positive and negative integers.
The solution must be implemented in a language of your choice (e.g., Python, Java, C++).

Follow-up:
How would you modify the merge sort algorithm to sort the array in descending order?
Could you implement an iterative version of merge sort instead of the recursive one? What are the trade-offs?
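
One possible answer to the question above, sketched in Python. The function names (merge, merge_sort, merge_sort_iterative) and the descending flag are illustrative choices, not part of the question; the sketch covers the recursive version, the descending-order variant, and a bottom-up iterative version.

```python
from typing import List


def merge(left: List[int], right: List[int], descending: bool = False) -> List[int]:
    """Merge two already-sorted lists into a single sorted list."""
    merged: List[int] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # On ties, take from the left list so equal elements keep their order (stable).
        if descending:
            take_left = left[i] >= right[j]
        else:
            take_left = left[i] <= right[j]
        if take_left:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(values: List[int], descending: bool = False) -> List[int]:
    """Recursive (top-down) merge sort; returns a new list."""
    if len(values) <= 1:
        return list(values)
    mid = len(values) // 2
    left = merge_sort(values[:mid], descending)
    right = merge_sort(values[mid:], descending)
    return merge(left, right, descending)


def merge_sort_iterative(values: List[int], descending: bool = False) -> List[int]:
    """Bottom-up merge sort: merge runs of width 1, 2, 4, ... without recursion."""
    result = list(values)
    n = len(result)
    width = 1
    while width < n:
        for lo in range(0, n, 2 * width):
            mid = min(lo + width, n)
            hi = min(lo + 2 * width, n)
            result[lo:hi] = merge(result[lo:mid], result[mid:hi], descending)
        width *= 2
    return result


if __name__ == "__main__":
    data = [38, 27, 43, 3, 9, 82, 10]
    print(merge_sort(data))
    print(merge_sort(data, descending=True))
    print(merge_sort_iterative(data))
```

Merge sort runs in O(n log n) time in the best, average, and worst case and needs O(n) auxiliary space for merging. Bubble sort is O(n^2) on average and in the worst case, while quicksort averages O(n log n) but degrades to O(n^2) on unfavorable pivots. The iterative version avoids the recursion stack at the cost of slightly more index bookkeeping.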

Arrays

Recursion

Google

How would you assign ACLs to users or groups?

System Design · Medium

Let's discuss Access Control Lists (ACLs). Imagine you're designing a system where you need to control access to various resources. How would you approach assigning ACLs to users and groups? Be specific. For example, consider a scenario with files, directories, and applications. How would you define permissions (read, write, execute, delete) and associate them with individual users (like 'john.doe') or groups (like 'developers' or 'administrators')? What different strategies would you evaluate for managing ACLs, and what are the tradeoffs between them in terms of security, performance, and ease of administration? For example, would you use an identity-based approach, a role-based approach, or a combination of both? Consider also how you would handle inheritance of ACLs in a hierarchical structure, such as a file system. How would you prevent privilege escalation and ensure that users only have the access they need? Finally, how would you audit ACL changes and monitor access attempts to detect potential security breaches?
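
As a starting point for discussion, the following is a minimal Python sketch of one way to model the scenario above: allow-only ACL entries on resources, group membership, inheritance from parent resources, and an audit log. Every name here (Permission, Resource, AccessController, the "user:" and "group:" principal prefixes) is hypothetical, and the sketch deliberately omits deny entries, roles, and persistence.

```python
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import Dict, List, Optional, Set


class Permission(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    DELETE = auto()


@dataclass
class Resource:
    name: str
    parent: Optional["Resource"] = None
    inherit: bool = True  # when False, ACL entries on ancestors are ignored
    # Maps a principal ("user:john.doe" or "group:developers") to granted permissions.
    acl: Dict[str, Permission] = field(default_factory=dict)


class AccessController:
    def __init__(self) -> None:
        self.groups: Dict[str, Set[str]] = {}  # group name -> member user names
        self.audit_log: List[str] = []         # records grants and access checks

    def add_to_group(self, user: str, group: str) -> None:
        self.groups.setdefault(group, set()).add(user)

    def grant(self, actor: str, resource: Resource, principal: str, perms: Permission) -> None:
        current = resource.acl.get(principal, Permission.NONE)
        resource.acl[principal] = current | perms
        self.audit_log.append(f"GRANT by={actor} on={resource.name} to={principal} perms={perms}")

    def effective(self, user: str, resource: Resource) -> Permission:
        # A user's principals are their own identity plus every group they belong to.
        principals = {f"user:{user}"}
        principals |= {f"group:{g}" for g, members in self.groups.items() if user in members}
        perms = Permission.NONE
        node: Optional[Resource] = resource
        while node is not None:
            for principal in principals:
                perms |= node.acl.get(principal, Permission.NONE)
            node = node.parent if node.inherit else None
        return perms

    def check(self, user: str, resource: Resource, needed: Permission) -> bool:
        # Default deny: access is allowed only if every needed bit was granted.
        allowed = (self.effective(user, resource) & needed) == needed
        self.audit_log.append(f"CHECK user={user} on={resource.name} needed={needed} allowed={allowed}")
        return allowed
```

Granting to groups rather than to individuals keeps administration manageable, while the default-deny check supports least privilege. A production design would also need to restrict who may call grant, since unrestricted granting is a direct path to privilege escalation.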

Graphs

Dynamic Programming

Google

Tell me about a time you had to work with a signed contract.

Behavioral

Tell me about a time you had to work with a signed contract. Describe the situation, your role, and the outcome. What were the key clauses or provisions that were most relevant to the situation? What challenges, if any, did you face in interpreting or adhering to the contract terms? How did you ensure that your actions were in compliance with the contract, and what steps did you take to mitigate potential risks or disputes? For example, consider a scenario where you were managing a project with a vendor, and the contract outlined specific deliverables, timelines, and payment terms. A conflict arose when the vendor failed to meet a critical deadline, potentially impacting the project's overall timeline and budget. How did you leverage the contract to address the issue, protect your company's interests, and find a resolution that was fair to both parties? Or, imagine you were involved in a negotiation where the other party wanted to change certain terms after the contract was signed. How did you handle the situation, ensuring that any modifications were properly documented and agreed upon by all parties involved? What did you learn from this experience, and how has it influenced your approach to working with contracts in subsequent projects or situations?
