Lucidya is seeking a Site Reliability Engineer (SRE) to join its Cloud Engineering team. This role focuses on enhancing the reliability, scalability, and automation of cloud-based infrastructure. The ideal candidate will bring hands-on experience with cloud environments, containerized workloads, and monitoring systems.
The position involves managing critical infrastructure components, ensuring high availability, and implementing automation across cloud platforms such as AWS, GCP, or Azure. Key responsibilities include managing Kubernetes clusters, implementing monitoring with tools such as Datadog or Grafana, and developing automation scripts for operational efficiency.
The role requires 3 years of experience in SRE/DevOps, strong cloud platform expertise, and proficiency in technologies such as Kubernetes, Python, and Infrastructure as Code tools. The successful candidate will participate in on-call rotations, collaborate with other teams, and help establish best practices for infrastructure reliability.
This remote position offers the opportunity to work with modern cloud technologies while making a significant impact on system reliability and performance. The role combines technical expertise with problem-solving skills, making it well suited to engineers passionate about infrastructure automation.