Site Reliability Engineer (SRE)

Cairo, Cairo Governorate, Egypt
Alexandria, Alexandria Governorate, Egypt
Riyadh, Saudi Arabia
Site Reliability
Mid-Level Software Engineer
Remote
3+ years of experience
Enterprise SaaS

Description For Site Reliability Engineer (SRE)

Lucidya is seeking a Site Reliability Engineer (SRE) to join its Cloud Engineering team. This role focuses on enhancing the reliability, scalability, and automation of cloud-based infrastructure. The ideal candidate will bring hands-on experience with cloud environments, containerized workloads, and monitoring systems.

The position involves managing critical infrastructure components, ensuring high availability, and implementing automation across cloud platforms like AWS, GCP, or Azure. Key responsibilities include Kubernetes cluster management, monitoring implementation using tools like Datadog or Grafana, and developing automation scripts for operational efficiency.

The role requires at least 3 years of experience in SRE/DevOps, strong cloud platform expertise, and proficiency in technologies like Kubernetes, Python, and Infrastructure as Code tools. The successful candidate will participate in on-call rotations, collaborate with various teams, and contribute to establishing best practices for infrastructure reliability.

This remote position offers the opportunity to work with modern cloud technologies while making a significant impact on system reliability and performance. The role combines technical expertise with problem-solving skills, making it ideal for engineers passionate about infrastructure automation and system reliability.

Last updated a day ago

Responsibilities For Site Reliability Engineer (SRE)

  • Ensure high availability (HA) and scalability of critical infrastructure components
  • Proactively identify and eliminate single points of failure across the cloud environment
  • Manage and optimize cloud-based workloads across AWS, GCP, or Azure
  • Automate provisioning, scaling, and maintenance tasks using Infrastructure as Code
  • Manage Kubernetes cluster operations, including deployment, scaling, upgrades, and troubleshooting
  • Implement and standardize monitoring solutions
  • Participate in on-call rotations and troubleshoot incidents
  • Develop and maintain automation scripts for routine operational tasks
  • Work closely with DevOps and Engineering teams

Requirements For Site Reliability Engineer (SRE)

Kubernetes
Python
Linux
  • 3 years of experience in a similar SRE, DevOps, or Infrastructure Engineer role
  • Strong experience with at least one major cloud provider (AWS, GCP, or Azure)
  • Hands-on experience with Kubernetes and containerization
  • Proficient in scripting languages such as Python, Bash, or similar for automation
  • Experience with Infrastructure as Code (IaC) tools
  • Strong understanding of load balancers, networking, and HA architecture
  • Experience with CI/CD tools
  • Experience with modern monitoring and observability tools
  • Strong analytical skills and ability to resolve complex technical issues
  • Excellent communication and collaboration skills
