Senior Site Reliability Engineer

Microsoft builds the data platform for the age of AI, powering data-first applications and driving a data culture through the Azure Data engineering team.
CAD $108,100 - $199,700
Site Reliability
Senior Software Engineer
Hybrid
5,000+ Employees
6+ years of experience
AI · Enterprise SaaS · Cloud

Description For Senior Site Reliability Engineer

Microsoft's Azure Data engineering team is seeking a Senior Site Reliability Engineer to join its databases team, working specifically on Azure Cosmos DB. This role is crucial to maintaining Microsoft's operational database systems, with a focus on developer-friendly, mission-critical, AI-enabled operational databases. The position involves working with a globally distributed, massively scalable, multi-model cloud database service designed for planet-scale applications.

The ideal candidate will be responsible for building and optimizing solutions that analyze massive amounts of telemetry and service health indicators in near real-time, performing automated root cause analysis, and implementing necessary mitigations to maintain strict Service Level Objectives (SLOs). The role requires collaboration with engineering teams, customer interaction, and a data-driven approach to problem-solving.

Working in Vancouver with a hybrid work arrangement (up to 50% work from home), you'll be part of a team that operates like a startup while having the resources and impact of Microsoft. The position offers competitive compensation (CAD $108,100 - $199,700) and comprehensive benefits, including healthcare, educational resources, and parental leave.

This is an excellent opportunity for experienced engineers who are passionate about service reliability, automation, and working with large-scale distributed systems. The role combines technical expertise with customer interaction, making it ideal for those who enjoy both deep technical work and collaborative problem-solving.

Last updated 20 hours ago

Responsibilities For Senior Site Reliability Engineer

  • Collaborate with engineering teams on building and enhancing tooling and automation solutions
  • Work with customers to understand pain points around supportability and SLO attainment
  • Design and implement changes to service telemetry
  • Enhance the customer-facing experience through proactive alerting
  • Analyze data and provide operational insights to Design and Product teams

Requirements For Senior Site Reliability Engineer

Python
Java
  • 6+ years of technical experience in software engineering, network engineering, or systems administration
  • Bachelor's/Master's Degree in Computer Science, Information Technology, or related field
  • Understanding of observability and MELT (metrics, events, logs, traces) implementation patterns
  • Experience with Logic Apps and Jupyter Notebooks
  • 5+ years of hands-on experience in Python/Java/C#
  • 3+ years of operational experience in improving Service Reliability
  • Systematic problem-solving approach with effective communication skills

Benefits For Senior Site Reliability Engineer

Medical Insurance
Education Budget
Parental Leave
  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
