Google Cloud Site Reliability Engineer

66degrees is a leading consulting and professional services company specializing in developing AI-focused, data-led solutions leveraging the latest advancements in cloud technology.
Site Reliability
Senior Software Engineer
Hybrid
4+ years of experience
AI · Enterprise SaaS

Description For Google Cloud Site Reliability Engineer

66degrees is a leading consulting and professional services company specializing in developing AI-focused, data-led solutions leveraging the latest advancements in cloud technology. We help the world's leading brands transform their business challenges into opportunities and shape the future of work. Our Managed Cloud Optimization (MCO) team works with some of the largest cloud users globally to help them transform their businesses with technology.

As a Google Cloud Site Reliability Engineer (SRE) at 66degrees, you'll combine Google Cloud Platform expertise with a passion for DevOps methodologies to help our clients maintain, optimize, and scale their cloud implementations. Your daily responsibilities will include resolving critical outages, designing and deploying new cloud workloads, and building self-healing automation.

You'll work with cutting-edge Google Cloud technologies like Google Kubernetes Engine (GKE), Anthos, BigQuery, and data pipelines, as well as leading third-party tools like Prometheus, Datadog, and many others. You'll also use Python and Terraform to create automation, deploy infrastructure, and contribute to open-source work.

This role offers the opportunity to continually build and apply your Google Cloud expertise to new and varied environments while acting as a key contributor to building the best Google consulting partner in the industry. The position includes a weekend on-call rotation, and candidates in the Pacific and Mountain time zones are preferred.

Join 66degrees if you're looking for a challenging and rewarding career in cloud technology, where you can make a significant impact and grow both professionally and personally.

Last updated a month ago

Responsibilities For Google Cloud Site Reliability Engineer

  • Ensure near-zero downtime through monitoring and alerting, self-healing automation, and continuous improvement
  • Create highly automated, available, and scalable systems by applying software and infrastructure principles
  • Employ and advise clients on DevOps and SRE principles and practices
  • Take a proactive approach to our clients' workloads by anticipating failures, automating tasks, ensuring availability, and providing a great customer experience
  • Work closely with clients, your team, and Google engineers to investigate and resolve infrastructure issues
  • Manage a Jira queue of inbound requests for numerous clients while effectively balancing and prioritizing projects
  • Contribute to ad-hoc initiatives such as writing documentation, open-sourcing, and improving operations

Requirements For Google Cloud Site Reliability Engineer

Python
Kubernetes
Linux
  • 4+ years of cloud and infrastructure experience, including demonstrated expertise with Linux, Windows, Kubernetes (k8s), databases, and networking services
  • 2+ solid years of full-time Google Cloud experience
  • Proficiency with Python required. Other programming language experience is a plus
  • Strong provisioning and configuration skills using Terraform
  • Experience in troubleshooting that spans systems, network, and code
  • Experience with 24x7x365 monitoring, incident response, and on-call support preferred
  • Experience determining and negotiating error budgets, SLIs, SLOs, and SLAs with product owners
  • Ability to work independently and as a member of a greater team, including cross-team activities
  • Experience working with Agile methodologies such as Scrum and Kanban across the SDLC
  • Proven experience balancing service reliability, metrics, sustainability, technical debt, and operational toil for live services running at scale
  • Strong communication skills, as this is a heavily customer-facing role
  • Bachelor's degree in computer science, electrical engineering, or equivalent
