Site Reliability Engineer

Next-generation payments technology company providing cloud-native software to optimize financial transaction processing since 2012.
Site Reliability
Mid-Level Software Engineer
In-Person
3+ years of experience
Finance

Description For Site Reliability Engineer

Electrum Payments is a leading payments technology company that has been delivering enterprise-grade payment solutions since 2012. They specialize in cloud-native software for optimizing financial transaction processing, focusing on high-volume, low-value payment schemes. As a Site Reliability Engineer, you'll be at the forefront of ensuring the reliability and performance of critical payment systems that impact millions of South Africans daily.

The role combines traditional IT operations with software engineering expertise, requiring you to build and maintain robust, scalable systems. You'll work on critical tasks including incident prevention, infrastructure management, monitoring system implementation, and ensuring smooth cloud operations. The position offers significant opportunities for personal growth and career progression within a company that values technical excellence.

Key responsibilities include developing reliable applications, managing critical incidents, implementing monitoring solutions, and driving cost-optimization initiatives. You'll also be involved in disaster recovery planning and system performance optimization. The ideal candidate has strong technical skills in AWS services and observability tools, along with a solid background in SRE practices.

The company offers an excellent work environment with a strong focus on work-life balance. Benefits include flexible working hours, daily cooked lunches, and regular team social activities. Electrum fosters a transparent culture where learning from mistakes is encouraged, making it an ideal place for professional growth and development.

Last updated a month ago

Responsibilities For Site Reliability Engineer

  • Monitor and improve the reliability, scalability, performance, and availability of services, automating wherever possible
  • Collaborate with teams to develop reliable, available, and scalable applications
  • Participate in on-call rotations and manage critical incidents
  • Develop and maintain incident response processes and alerting mechanisms
  • Diagnose and resolve infrastructure and system-level issues
  • Implement automation tools and frameworks for deployment, configuration, and monitoring processes
  • Design and implement disaster recovery strategies
  • Drive cost-optimization initiatives

Requirements For Site Reliability Engineer

Skills: Kubernetes
  • Bachelor's degree in Computer Science, Information Technology, or related field
  • 3+ years of experience in an SRE or similar role
  • Familiarity with AWS services such as EC2, S3, RDS, Lambda, EKS, and CloudWatch
  • Experience with observability tools such as Elastic and Grafana
  • Software development skills are an advantage
  • Strong troubleshooting and problem-solving skills
  • Excellent collaboration, communication, and time management skills
  • Attention to detail and ability to work effectively in a team environment

Benefits For Site Reliability Engineer

  • Flexible core working hours
  • Daily cooked lunches
  • Stocked kitchen
  • Team socializing and getaways
  • Social outings
