Site Reliability Engineer

PalUp

AI-driven social platform serving millions of global users

Taipei City, Taiwan

Site Reliability

Mid-Level Software Engineer

In-Person

3+ years of experience

Description For Site Reliability Engineer

PalUp is revolutionizing social interactions through its AI-driven platform, which serves millions of users globally. As a Site Reliability Engineer, you'll be at the heart of its engineering team, ensuring the platform's stability, reliability, and efficiency.

The role demands a skilled engineer with 3+ years of SRE/DevOps experience who excels in cloud services (particularly GCP), Linux administration, and container orchestration with Kubernetes. You'll work with Python, Golang, and monitoring tools such as Grafana and Prometheus.

Your responsibilities range from designing and implementing monitoring systems to optimizing CI/CD pipelines and managing cloud-based deployments. You'll play a key role in analyzing and improving system performance, ensuring high availability, and developing automation tools to streamline operations.

The ideal candidate values automation, proactive problem-solving, and collaborative teamwork. You'll thrive in PalUp's dynamic environment where innovation and technical excellence are paramount. The company emphasizes creating scalable solutions and empowering teams to deliver world-class experiences.

This is an excellent opportunity for a mid-level engineer passionate about site reliability and DevOps to make a significant impact in a growing AI-focused company. You'll work alongside talented engineers who value collaboration, fairness, and mutual respect, while helping shape the future of AI-driven social interactions.

Last updated 3 months ago

Responsibilities For Site Reliability Engineer

Design, implement, and maintain monitoring and alerting systems to ensure service stability
Maintain and optimize CI/CD pipelines to improve deployment efficiency and reliability
Manage and improve cloud-based deployment processes using Docker, Kubernetes, and related tools
Analyze system bottlenecks and proactively implement architectural and performance optimizations
Collaborate with development teams to ensure high availability and fault tolerance of applications and databases
Develop scripts and automation tools (e.g., Python, Shell scripts) to streamline operational tasks

Requirements For Site Reliability Engineer

Python

Kubernetes

Linux

PostgreSQL

MongoDB

MySQL

3+ years of experience in SRE/DevOps or related roles
Strong expertise in cloud services and infrastructure (GCP preferred; AWS or Azure is a plus)
Solid knowledge of Linux system administration and maintenance
Proficiency in programming languages such as Python or Golang
Hands-on experience with monitoring and alerting systems (Grafana, Prometheus)
Advanced knowledge of Kubernetes and containerization tools like Docker
Familiarity with log management systems and operational configurations
Strong English reading and communication skills for technical documentation