Site Reliability Engineer (SRE)

Cairo, Cairo Governorate, Egypt
Alexandria, Alexandria Governorate, Egypt
Riyadh, Saudi Arabia
Site Reliability
Mid-Level Software Engineer
Remote
3+ years of experience
Enterprise SaaS

Description For Site Reliability Engineer (SRE)

Lucidya is seeking a Site Reliability Engineer (SRE) to join its Cloud Engineering team. This role focuses on enhancing the reliability, scalability, and automation of cloud-based infrastructure. The ideal candidate will bring hands-on experience with cloud environments, containerized workloads, and monitoring systems.

The position involves managing critical infrastructure components, ensuring high availability, and implementing automation across cloud platforms such as AWS, GCP, or Azure. Key responsibilities include Kubernetes cluster management, implementing monitoring with tools such as Datadog or Grafana, and developing automation scripts for operational efficiency.

The role requires 3 years of experience in SRE/DevOps, strong cloud platform expertise, and proficiency in technologies like Kubernetes, Python, and Infrastructure as Code tools. The successful candidate will participate in on-call rotations, collaborate with various teams, and contribute to establishing best practices for infrastructure reliability.

This remote position offers the opportunity to work with modern cloud technologies while making a significant impact on system reliability and performance. The role combines technical expertise with problem-solving skills, making it ideal for engineers passionate about infrastructure automation and system reliability.

Last updated 2 months ago

Responsibilities For Site Reliability Engineer (SRE)

  • Ensure high availability (HA) and scalability of critical infrastructure components
  • Proactively identify and eliminate single points of failure across the cloud environment
  • Manage and optimize cloud-based workloads across AWS, GCP, or Azure
  • Automate provisioning, scaling, and maintenance tasks using Infrastructure as Code
  • Manage Kubernetes cluster operations, including deployment, scaling, upgrades, and troubleshooting
  • Implement and standardize monitoring solutions
  • Participate in on-call rotations and troubleshoot incidents
  • Develop and maintain automation scripts for routine operational tasks
  • Work closely with DevOps and Engineering teams

Requirements For Site Reliability Engineer (SRE)

Kubernetes
Python
Linux
  • 3 years of experience in a similar SRE, DevOps, or Infrastructure Engineer role
  • Strong experience with at least one major cloud provider (AWS, GCP, or Azure)
  • Hands-on experience with Kubernetes and containerization
  • Proficient in scripting languages such as Python, Bash, or similar for automation
  • Experience with Infrastructure as Code (IaC) tools
  • Strong understanding of load balancers, networking, and HA architecture
  • Experience with CI/CD tools
  • Experience with modern monitoring and observability tools
  • Strong analytical skills and ability to resolve complex technical issues
  • Excellent communication and collaboration skills
