Site Reliability Engineering II

Microsoft builds cloud and enterprise software, focusing on empowering people and organizations globally through technology.
$98,300 - $193,200
Site Reliability
Mid-Level Software Engineer
Remote
5,000+ Employees
4+ years of experience
Enterprise SaaS · AI · Cloud

Description For Site Reliability Engineering II

Microsoft's Azure Data engineering team is seeking a Site Reliability Engineer II to join its databases team, focusing on operational database systems. The role sits within the Azure Cosmos DB team. Azure Cosmos DB is Microsoft's globally distributed, massively scalable, multi-model cloud database service. The position offers up to 100% remote work with 0-25% travel requirements.

The ideal candidate will focus on building and optimizing solutions that analyze massive amounts of telemetry and service health indicators in real time, perform automated root cause analysis, and implement the mitigations needed to maintain Service Level Objectives (SLOs). The role requires expertise in large-scale cloud services, with emphasis on improving service reliability, availability, and performance.

The position offers competitive compensation ranging from $98,300 to $193,200 per year (higher in SF and NYC areas), along with comprehensive benefits including healthcare, educational resources, and parental leave. This is an excellent opportunity to work with cutting-edge technology in a team that operates like a startup while being part of a major tech company.

The role combines technical expertise with customer interaction, requiring both strong engineering skills and the ability to communicate effectively with enterprise customers. You'll be working on critical systems that serve various industries including Healthcare, Retail, Telecommunications, and IoT, where service availability and latency are paramount.

Microsoft values diversity and encourages applications from candidates with different experiences and perspectives. The company's mission is to empower every person and organization on the planet to achieve more, making this an excellent opportunity for those passionate about making a global impact through technology.

Last updated 2 days ago

Responsibilities For Site Reliability Engineering II

  • Collaborating with engineering teams on building and enhancing tooling and automation solutions
  • Working with customers to understand pain points around Supportability and SLO attainment
  • Designing and implementing service telemetry changes
  • Enhancing the customer-facing experience through proactive alerting
  • Analyzing data and providing operational insights to Design and Product teams
  • Interfacing with large enterprise customers to handle service escalations

Requirements For Site Reliability Engineering II

Kubernetes
Python
  • 4+ years technical experience in software engineering, network engineering, or systems administration
  • 3+ years of SRE or SWE experience running large-scale cloud services
  • 2+ years of operational experience improving service reliability, availability, and performance
  • Understanding of observability and MELT (metrics, events, logs, traces) implementation patterns
  • Experience with Logic Apps and with authoring Jupyter Notebooks
  • Experience in analyzing and troubleshooting distributed systems
  • Bachelor's Degree in Computer Science, Information Technology, or related field
  • Must pass Microsoft Cloud Background Check

Benefits For Site Reliability Engineering II

Medical Insurance
Education Budget
Parental Leave
Mental Health Assistance
  • Industry leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Networking opportunities
