Lead Site Reliability Engineer (Observability)

Xero helps supercharge businesses by automating routine tasks, providing actionable insights and connecting businesses with data, advisors and apps.
Melbourne VIC, Australia; Sydney NSW, Australia; Brisbane QLD, Australia
$150,000 - $220,000
Site Reliability
Staff Software Engineer
Hybrid
1,000 - 5,000 Employees
8+ years of experience
Enterprise SaaS

Description For Lead Site Reliability Engineer (Observability)

Xero, a company dedicated to empowering businesses through automation and data insights, is seeking a Lead Site Reliability Engineer to drive its observability strategy. This role combines hands-on technical leadership with strategic influence, focusing on implementing sophisticated monitoring and remediation toolsets. The position is part of the global Site Reliability Engineering team, which spans New Zealand, Australia, and the USA.

The ideal candidate will take ownership of shaping observability at Xero, leading the adoption of OpenTelemetry and modern solutions. This role requires deep expertise in reliability and observability concepts, with experience in implementing these in large, distributed cloud environments. The position demands proficiency in programming languages like C#, JavaScript, Golang, or Python, along with extensive experience with various monitoring and logging tools.

Working in a hybrid environment across multiple Australian locations, you'll collaborate with Product Managers, Team Leads, and Principal Engineers to align team efforts with broader SRE and company goals. The role offers excellent benefits, including generous paid leave, comprehensive health benefits, and opportunities for career growth. This is an exceptional opportunity for a technical leader who wants to make a lasting impact on system reliability and performance at scale.

Responsibilities For Lead Site Reliability Engineer (Observability)

  • Design and implement observability solutions to enhance engineering practices
  • Guide technical design and ensure adherence to architectural principles
  • Identify and address failure patterns to enhance system reliability
  • Define and evolve observability and reliability standards
  • Promote automation, agile, DevOps, and CI/CD methodologies
  • Participate in hiring and recruitment
  • Create an inclusive and collaborative environment

Requirements For Lead Site Reliability Engineer (Observability)

Python
JavaScript
Go
  • Deep knowledge of reliability and observability concepts
  • Experience implementing observability in large, distributed cloud environments (AWS)
  • Experience with monitoring tools like Prometheus, VictoriaMetrics, Jaeger, New Relic, Datadog
  • Proficiency in programming languages like C#, JavaScript, Golang, or Python
  • Experience in on-call rotations and resolving production incidents
  • Experience in agile software development environments
  • Strong stakeholder engagement and influence skills
  • Experience managing observability platforms

Benefits For Lead Site Reliability Engineer (Observability)

Medical Insurance
Dental Insurance
Vision Insurance
Mental Health Assistance
Parental Leave
Equity
  • Generous paid leave
  • Dedicated physical and mental wellbeing leave
  • Employee Assistance Program
  • Health insurance
  • Life insurance
  • Income protection
  • Wellbeing and sports programmes
  • 26 weeks paid parental leave for primary caregivers
  • Employee Share Plan
  • Flexible working
  • Career development
