Site Reliability Engineer

Cloud-based, all-in-one white-label marketing and sales platform serving 60K+ agencies and 500K businesses globally
Site Reliability
Mid-Level Software Engineer
Remote
1,000 - 5,000 Employees
3+ years of experience
Enterprise SaaS

Description For Site Reliability Engineer

HighLevel is a cloud-based marketing and sales platform serving over 60K agencies and 500K businesses globally. The platform handles 40 billion API hits and 120 billion events per month, and manages 200+ terabytes of application data and 6 petabytes of storage across 500 microservices.

As a Site Reliability Engineer, you'll join a team focused on maintaining and improving system reliability and performance. You'll work with GCP, AWS, Kubernetes, and monitoring tools such as Prometheus, Grafana, ELK, and OpenTelemetry to keep the platform running reliably.

The role offers an opportunity to work with a diverse, global team of about 1,200 employees across 15 countries. HighLevel values work-life balance and maintains a company culture that fosters creativity and collaboration, whether you're working remotely or from the company's Dallas headquarters.

This position suits engineers who are passionate about large-scale systems, automation, and infrastructure as code, and who want to solve complex reliability challenges at scale with modern tools.

HighLevel is committed to diversity and inclusion, creating an environment where talented individuals from all backgrounds can thrive. The company offers a unique opportunity to work on enterprise-scale challenges while maintaining a collaborative and inclusive culture.


Responsibilities For Site Reliability Engineer

  • Develop and improve observability using monitoring, logging, tracing, and alerting tools
  • Optimize system performance, troubleshoot incidents, and conduct post-mortems/RCA
  • Collaborate with developers to enhance application reliability, scalability, and performance
  • Drive cost optimisation efforts in cloud environments
  • Monitor multiple data stores (MongoDB, Redis, Elasticsearch, queue-based systems, etc.)

Requirements For Site Reliability Engineer

Key skills: Kubernetes, MongoDB, Python, Redis
  • 3+ years in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
  • Hands-on experience with GCP and AWS
  • Experience with Terraform, Helm, or equivalent tools
  • Experience with Docker and Kubernetes (GKE)
  • Experience with Prometheus, Grafana, ELK, OpenTelemetry, or similar monitoring/logging tools
  • Proficiency in Python or shell scripting (e.g. Bash)
  • Basic understanding of parsing API responses and manipulating JSON
  • Experience with Jenkins, GitHub Actions, ArgoCD, or similar tools
  • Experience with on-call rotations, SLOs, SLIs, and SLAs
  • Experience monitoring MongoDB, Redis, Elasticsearch, and queue-based systems
