Principal Site Reliability Engineer

Operator of the world's largest security cloud, protecting enterprise customers through the AI-powered Zero Trust Exchange platform.
$161,000 - $230,000
Site Reliability
Principal Software Engineer
Hybrid
1,000 - 5,000 Employees
8+ years of experience
Cybersecurity · Enterprise SaaS

Description For Principal Site Reliability Engineer

Zscaler, the world's largest cloud security platform operator, is seeking a Principal Site Reliability Engineer to join their Engineering Team. This role offers an exciting opportunity to work with cutting-edge cloud security technology that serves thousands of enterprise customers, including 40% of Fortune 500 companies. The position involves working with large-scale distributed systems across major cloud platforms (AWS, GCP, Azure) and requires expertise in infrastructure as code.

The successful candidate will be part of a team that has secured over 100 patents and continues to innovate in cloud security. You'll be responsible for supporting mission-critical production services, developing automation tools, and ensuring high system reliability. The role requires strong problem-solving skills and the ability to work in high-pressure situations.

This hybrid position, based in San Jose, CA, offers a competitive salary range of $161,000 - $230,000 USD, along with comprehensive benefits including health plans, retirement options, and education reimbursement. The company promotes a diverse and inclusive culture, having been named a Best Workplace in Technology by Fortune.

Working at Zscaler means joining a team of cloud architects, software engineers, and security experts who are enabling organizations worldwide to embrace cloud-first strategies. The company's multitenant architecture serves more than 15 million users across 185 countries, making it an ideal environment for those passionate about large-scale system reliability and security.

The role requires 8-10+ years of relevant SRE experience, deep understanding of SRE principles, and expertise in cloud platforms. Knowledge of Python, Golang, Java, or Rust, along with Kubernetes experience, will make candidates stand out. This is an excellent opportunity for someone who thrives in a fast-paced, collaborative environment and wants to contribute to building and innovating for the greater good.

Responsibilities For Principal Site Reliability Engineer

  • Work with large-scale distributed systems, cloud platforms (AWS, GCP, Azure), and infrastructure as code
  • Support large-scale services, manage high-pressure situations, and participate in on-call rotations
  • Develop and enhance tooling for large-scale service technologies
  • Diagnose and fix issues by editing code and modifying infrastructure configurations
  • Develop automation tools and optimize services through version-controlled infrastructure-as-code

Requirements For Principal Site Reliability Engineer

Python
Go
Java
Rust
Kubernetes
  • 8-10+ years of relevant experience working in SRE teams, supporting mission-critical production services
  • Deep understanding of SRE principles, practices, and tools
  • Experience with large-scale distributed systems, cloud platforms (AWS, GCP, Azure), and infrastructure as code
  • Experience with incident response including resolving system failures and outages
  • Bachelor's Degree in Computer Science, Management Information Systems, or equivalent experience
  • Proficiency in Python, Golang, Java, or Rust
  • Experience with Kubernetes on multiple cloud provider platforms

Benefits For Principal Site Reliability Engineer

Medical Insurance
Dental Insurance
Vision Insurance
Parental Leave
401k
Education Budget
  • Various health plans
  • Time off plans for vacation and sick time
  • Parental leave options
  • Retirement options
  • Education reimbursement
  • In-office perks

Jobs Related To Zscaler Principal Site Reliability Engineer

Principal Site Reliability Engineer

Principal SRE role at Zscaler, leading cloud security platform, working with distributed systems and cloud infrastructure in San Jose, CA.

Systems Engineering Principal

Principal Engineer role leading reliability engineering and post-incident analysis at Salesforce, driving systemic improvements across cloud platforms.

Principal Engineer, AI, Trust, Security, Site Reliability Engineering

Lead technical initiatives in AI, Trust, and Security for Google's Site Reliability Engineering organization, architecting and implementing large-scale distributed systems.

Principal/Architect- Software Engineering - Availability

Principal Software Engineer role at Salesforce focusing on Site Reliability Engineering, building and maintaining large-scale distributed systems.