Google Cloud Site Reliability Engineer

66degrees

66degrees is a leading consulting and professional services company specializing in developing AI-focused, data-led solutions leveraging the latest advancements in cloud technology.

San Francisco, CA, USA

Site Reliability

Senior Software Engineer

Hybrid

4+ years of experience

AI · Enterprise SaaS

Description For Google Cloud Site Reliability Engineer

66degrees is a leading consulting and professional services company specializing in developing AI-focused, data-led solutions leveraging the latest advancements in cloud technology. We help the world's leading brands transform their business challenges into opportunities and shape the future of work. Our Managed Cloud Optimization (MCO) team works with some of the largest cloud users globally to help them transform their businesses with technology.

As a Google Cloud Site Reliability Engineer (SRE) at 66degrees, you'll combine Google Cloud Platform expertise with a passion for DevOps methodologies to help our clients maintain, optimize, and scale their cloud implementations. Your daily responsibilities will include solving critical outages, designing and deploying new cloud workloads, and building self-healing automation.

You'll work with cutting-edge Google Cloud technologies like Google Kubernetes Engine (GKE), Anthos, BigQuery, and data pipelines, as well as leading third-party tools like Prometheus, Datadog, and many others. You'll also use tools and languages like Python and Terraform to create automation, deploy infrastructure, and contribute to open-source efforts.

This role offers the opportunity to continually build and apply your Google Cloud expertise to new and varied environments while acting as a key contributor to building the best Google consulting partner in the industry. The position includes a weekend on-call rotation, and candidates in the Pacific and Mountain time zones are preferred.

Join 66degrees if you're looking for a challenging and rewarding career in cloud technology, where you can make a significant impact and grow both professionally and personally.

Last updated 9 months ago

Responsibilities For Google Cloud Site Reliability Engineer

Ensure near-zero downtime through monitoring and alerting, self-healing automation, and continuous improvement
Create highly automated, available and scalable systems by applying software and infrastructure principles
Employ and advise clients on DevOps and SRE principles and practices
Provide a proactive approach to our clients' workloads, anticipating failures, automating tasks, ensuring availability, and providing a great customer experience
Work closely with clients, your team, and Google engineers to investigate and resolve infrastructure issues
Manage a Jira queue of inbound requests for numerous clients while effectively balancing and prioritizing projects
Contribute to ad-hoc initiatives such as writing documentation, open-sourcing, and improving operations

Requirements For Google Cloud Site Reliability Engineer

Python

Kubernetes

Linux

4+ years of cloud and infrastructure experience, including demonstrated expertise with Linux, Windows, Kubernetes, databases, and networking services
2+ solid years of full-time Google Cloud experience
Proficiency with Python required. Other programming language experience is a plus
Strong provisioning and configuration skills using Terraform
Experience in troubleshooting that spans systems, network, and code
Experience with 24x7x365 monitoring, incident response, and on-call support preferred
Experience determining and negotiating error budgets, SLIs, SLOs, and SLAs with product owners
Ability to work independently and as a member of a greater team, including cross-team activities
Experience working with Agile methodologies such as Scrum and Kanban within the SDLC
Proven experience balancing service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale
Strong communication skills, as this is a heavily customer-facing role
Bachelor's degree in computer science, electrical engineering, or equivalent