Manager Site Reliability Engineer

Next-Gen Banking Tech company empowering banks and fintechs to launch banking products with its cloud-native processing platform, Zeta Tachyon.
Site Reliability
Staff Software Engineer
In-Person
1,000 - 5,000 Employees
10+ years of experience
Finance · Enterprise SaaS

Description For Manager Site Reliability Engineer

Zeta, a pioneering Next-Gen Banking Tech company valued at $1.5 billion, is seeking a Manager Site Reliability Engineer to join our innovative team. Founded in 2015, we've revolutionized banking technology with our cloud-native Zeta Tachyon platform, successfully processing more than 20 million cards globally.

As our Manager SRE, you'll play a crucial role in bridging development and operations, ensuring system reliability and scalability. You'll lead a team of SREs, implementing best practices in automation, monitoring, and infrastructure management. The position offers the opportunity to work with cutting-edge technologies including Kubernetes, cloud platforms, and modern DevOps tools.

Our ideal candidate brings 10-15 years of SRE experience and strong technical expertise in programming, cloud computing, and infrastructure as code. You'll be part of a company with 1,700+ employees (70% in R&D) across the US, EMEA, and Asia, backed by major investors like SoftBank and Mastercard.

This role combines technical leadership with team management, offering the chance to shape the reliability and scalability of systems that transform banking experiences. You'll work in our Hyderabad office, contributing to a culture of automation and continuous improvement while mentoring team members and driving technical excellence.

Join us in revolutionizing banking technology while working with a diverse, inclusive team that values innovation and technical expertise. This is an excellent opportunity for an experienced SRE leader looking to make a significant impact in the fintech industry.

Last updated 22 days ago

Responsibilities For Manager Site Reliability Engineer

  • Ensure reliability of software systems through scalable infrastructure
  • Develop automation tools and scripts for operational tasks
  • Monitor system performance and respond to incidents
  • Conduct capacity planning and usage pattern analysis
  • Implement and maintain monitoring and logging solutions
  • Lead and motivate a team of SREs
  • Provide mentorship and coaching to team members
  • Implement security best practices in infrastructure
  • Develop and maintain disaster recovery plans
  • Drive continuous improvement initiatives

Requirements For Manager Site Reliability Engineer

  • 10-15 years of experience in site reliability engineering
  • B.Tech/M.Tech in computer science, information technology or related field
  • Proficiency in Python, Go, and shell scripting (Bash)
  • Experience with Docker and Kubernetes
  • Proficiency in cloud platforms (AWS, Azure, or Google Cloud)
  • Knowledge of Infrastructure as Code tools like Terraform
  • Experience with monitoring tools (Prometheus, Grafana, ELK stack)
  • Understanding of networking concepts and protocols
  • Proficiency with version control systems like Git
  • Experience in CI/CD implementation
