Site Reliability Engineer

Microsoft is a global technology company that develops and provides cloud computing services, software, and hardware.
Site Reliability
Senior Software Engineer
Hybrid
5,000+ Employees
5+ years of experience
Enterprise SaaS · Cloud

Description For Site Reliability Engineer

Join Microsoft's Azure Customer Experience (CXP) Customer Reliability Engineering (CRE) Team, a crucial part of Azure Engineering leading world-class customer reliability initiatives. As a Site Reliability Engineer, you'll be responsible for improving customer experience on Azure, focusing on availability, reliability, resiliency, and uptime at scale.

The role involves working directly with customers, support teams, and engineering to diagnose and troubleshoot mission-critical applications built on the Microsoft Azure platform. You'll be part of a team that continuously listens to customers and drives enhancements across services, support programs, and incident response.

The position requires participation in an on-call rotation and collaboration with various teams to implement improvements based on customer feedback. You'll be responsible for analyzing signals, driving root cause analyses, and implementing service improvements. The ideal candidate should have strong technical expertise in cloud platforms, automation skills, and excellent communication abilities.

Working at Microsoft offers comprehensive benefits, including industry-leading healthcare, educational resources, parental leave, and opportunities for professional growth. The role offers a hybrid work arrangement with up to 50% work-from-home flexibility and involves 0-25% travel.

This is an excellent opportunity for a customer-focused reliability engineer who is passionate about cloud infrastructure and wants to make a significant impact on one of the world's leading cloud platforms. The position requires a proven track record of customer empathy, an engineering mindset, and technical excellence in site reliability engineering.


Responsibilities For Site Reliability Engineer

  • Participate in an on-call coverage rotation (approximately 15% of the time) for platform communications and security
  • Collaborate with engineering and product management teams to drive product improvements
  • Analyze signals and drive root cause analyses (RCAs) and service improvements
  • Drive continuous improvement in the Azure platform
  • Identify and drive requirements for customer resiliency and platform reliability
  • Implement customer-centric mitigation strategies and playbooks
  • Participate in next-generation architecture design for cloud infrastructure services
  • Develop key partnerships

Requirements For Site Reliability Engineer

  • Service engineering experience in a 24/7/365 enterprise environment
  • Fluency in automation tooling (e.g., PowerShell, CLI scripting)
  • Strong communication skills
  • Understanding of high availability, disaster recovery, and business continuity
  • Strategic thinking and analytical skills
  • Problem resolution and decision-making skills
  • Knowledge of the Windows or Linux platform
  • BS/BA in computer science, engineering, mathematics, or equivalent experience
  • Ability to pass the Microsoft Cloud Background Check

Benefits For Site Reliability Engineer

  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
