Site Reliability Engineer

Google

Google is a global technology company that provides a wide range of internet-related services and products.

Dublin, Ireland

Site Reliability

Mid-Level Software Engineer

In-Person

5,000+ Employees

2+ years of experience

Enterprise SaaS · Cloud

Description For Site Reliability Engineer

Google's Site Reliability Engineering (SRE) team is at the forefront of maintaining and optimizing large-scale, distributed systems that power Google Cloud's services. As an SRE, you'll be responsible for ensuring both internal and external systems maintain high reliability and appropriate uptime while constantly improving performance and capacity. The role combines software and systems engineering to build robust, fault-tolerant systems.

You'll work on complex challenges unique to Google Cloud's scale, applying your expertise in coding, algorithms, complexity analysis, and large-scale system design. The team values diversity, intellectual curiosity, and problem-solving in a blame-free environment that encourages collaboration and innovation.

The position offers opportunities to work on meaningful projects with significant impact, including automated troubleshooting, monitoring systems, and service level objectives. You'll be part of a culture that promotes self-direction while providing strong support and mentorship for professional growth.

Key responsibilities include managing project priorities, developing software solutions, and working with partner teams to ensure system reliability. The role requires both technical expertise and strong collaboration skills, as you'll be interfacing with various stakeholders to understand needs and implement solutions.

This is an excellent opportunity for engineers passionate about system reliability, automation, and large-scale infrastructure who want to make a significant impact at one of the world's leading technology companies.

Last updated 13 hours ago

Responsibilities For Site Reliability Engineer

Help land projects such as Automated Troubleshooting, Better Monitoring and Service Level Objectives (SLOs), and Podification of services.
Identify needs across network telemetry services. Propose, build, and launch cross-service solutions to satisfy those needs.
Motivate improvements in the team's systems, the infrastructure around them, and the network telemetry ecosystem.
Engage with partner teams and users to make systems reliable with relatable SLOs. Guide technical plans and goals towards creating reliable systems.
Operate the network telemetry systems of Google's production network.

Requirements For Site Reliability Engineer

Linux

Kubernetes

Bachelor's degree in Computer Science, a related field, or equivalent practical experience
2 years of experience with data structures/algorithms and software development in one or more programming languages
Experience in software engineering with knowledge of Google's production network
Experience researching, proposing, and launching engineering solutions
Ability to collaborate with current and prospective partner teams, product teams, and users to discover their needs and provide solutions
Excellent collaboration skills with technical goals for the team and partners
Excellent leadership skills

Benefits For Site Reliability Engineer

Medical Insurance

401k

Parental Leave

Equal employment opportunity
Inclusive work environment
Global collaboration opportunities
