Site Reliability Engineer

Hyperconnect

A global technology company developing real-time communication platforms and services across 230+ countries.

Seoul, South Korea

Site Reliability

Senior Software Engineer

Hybrid

5+ years of experience

Enterprise SaaS

Description For Site Reliability Engineer

Hyperconnect's Platform Department is seeking a Site Reliability Engineer to join the team that provides infrastructure and common platform technology for all services, including Azar and new products. The SRE team's mission is to keep every service developed at Hyperconnect stable so that users can enjoy special experiences without interruption. Working with AWS, Kubernetes, and a service mesh, you'll manage modern compute and network infrastructure across all services and systems. The role goes beyond routine infrastructure management and allows deep contributions to backend engineering. Because the business is real-time by nature, you'll work on high-performance, low-latency systems. You'll gain experience managing large-scale infrastructure in a global environment, supporting multiple products, and working in both B2B and B2C environments. The team manages infrastructure with Terraform, Helm, ArgoCD, and Spinnaker, and runs monitoring built on Zabbix, Prometheus, OpenTelemetry, and Elasticsearch. You'll be part of a team that values automation, continuous improvement, and proactive problem-solving, and that collaborates with multiple technical teams and stakeholders.

Last updated 2 months ago

Responsibilities For Site Reliability Engineer

Build and operate high-availability system infrastructure in AWS cloud environment
Implement and manage system/application logging, monitoring, and automation using tools like Zabbix and Prometheus
Lead incident response and foster a postmortem culture
Identify and implement service improvements based on SLO/SLI metrics
Conduct PoCs for new technologies and implement them in production
Manage and improve monitoring systems using OpenTelemetry and Elasticsearch
Support 300+ microservices with application monitoring

Requirements For Site Reliability Engineer

Kubernetes

Python

Linux

Strong understanding of CS fundamentals, especially Linux and Networking
Understanding of container technologies
Programming ability in Python, Golang
Practical experience with Linux servers in public cloud environments (AWS)
Excellent communication skills and documentation ability
Ability to identify and proactively solve various service issues
Enthusiasm for learning new technologies
