Site Reliability Engineer - EP (SE4)

GoTo Group is a technology company that owns and manages Gojek's engineering productivity across the board, powering more than 500 microservices.
United States
Gurugram, Haryana, India
Site Reliability
Staff Software Engineer
In-Person
1,000 - 5,000 Employees
8+ years of experience

Description For Site Reliability Engineer - EP (SE4)

At Engineering Platform - Site Reliability Engineering, Gojek, we are seeking passionate engineers to join us in improving and managing Gojek's engineering productivity, reliability, and observability across the board. The platform you'll work on is designed to power diverse applications across Gojek's many business lines. You'll be part of a driven team of engineers who deliver fundamental functionality to enable multiple product groups at Gojek to deal with scenarios at an interesting combination of scale and complexity.

As a Site Reliability Engineer, you will be directly responsible for improving engineering quality, productivity, and the experience of engineers driving fundamental business KPIs for the company. Your role will involve cloud administration, automation, DevOps practices, and Kubernetes administration.

Key responsibilities include:

  1. Cloud Administration: Hands-on administration of cloud-based infrastructure deployment.
  2. Automation: Designing and building SRE tooling to automate monitoring, incident response, and alerting.
  3. DevOps: Proficiency in CI/CD tools and infrastructure automation.
  4. Kubernetes Administration: Deploying and managing applications on Kubernetes.
  5. Infrastructure as Code (IaC): Experience with tools like Terraform, Terragrunt, and CloudFormation.
  6. Networking: Proficiency with Cloud Load Balancers and various cloud networking features.

The ideal candidate will have 8+ years of experience in SRE or DevOps (including at least 5 in a large enterprise cloud environment), strong hands-on experience with Kubernetes, deep knowledge of Linux and container technologies, and a solid understanding of networking concepts and protocols.

Join us at GoTo Group, where we leverage cutting-edge technology in cloud computing to manage real-time high throughput systems with a wide range of programming stacks. Be part of a team that solves for the happiness of our customers - Gojek Product Engineers - by designing abstractions and automations.

Last updated 4 months ago

Responsibilities For Site Reliability Engineer - EP (SE4)

  • Cloud Administration: Administering cloud-based infrastructure deployment
  • Automation: Designing and building SRE tooling to automate monitoring, incident response, and alerting
  • DevOps: Proficiency in CI/CD tools and infrastructure automation
  • Kubernetes Administration: Deploying and managing applications on Kubernetes
  • Infrastructure as Code (IaC): Experience with tools like Terraform, Terragrunt, and CloudFormation
  • Networking: Proficiency with Cloud Load Balancers and various cloud networking features

Requirements For Site Reliability Engineer - EP (SE4)

  • 8+ years of experience in the SRE or DevOps space (including at least 5 in a large enterprise cloud environment)
  • Experience maintaining and operating large-scale applications on cloud platforms such as AWS or GCP
  • Strong hands-on experience in Kubernetes
  • Deep knowledge of Linux as a production environment and of container technologies such as Docker
  • Ability to automate repetitive tasks and familiarity with scripting languages
  • Strong understanding of infrastructure-as-code principles and best practices, with tools such as Terraform
  • Solid understanding of networking concepts and protocols
  • Understanding of microservices architecture, event-driven architecture, configuration management tools (Chef/Ansible), and CI/CD
  • Strong technical aptitude including excellent troubleshooting and communication skills
