Principal Site Reliability Engineer

A global technology company empowering every person and organization on the planet to achieve more through innovative software and cloud solutions.
Burlington, MA, USA
$137,600 - $267,000
Site Reliability
Principal Software Engineer
Remote
5,000+ Employees
8+ years of experience
Healthcare · Enterprise SaaS

Description For Principal Site Reliability Engineer

Microsoft's Health & Life Sciences Solutions organization is seeking a Principal Site Reliability Engineer to join its interdisciplinary team working on next-generation healthcare solutions. The role focuses on the reliability and performance of database systems, particularly Azure SQL and Cosmos DB, in support of healthcare-oriented copilots. It combines database expertise with healthcare innovation: the successful candidate will design and maintain robust, scalable database solutions in a collaborative environment, and will be responsible for performance optimization, disaster recovery, security implementation, and mentoring team members while contributing to Microsoft's mission of empowering global healthcare outcomes.

The position requires 8+ years of technical experience and offers a base salary range of $137,600 - $267,000 (higher in the SF and NYC areas), along with comprehensive benefits. It supports up to 100% remote work, with 0-25% travel.

Last updated 2 months ago

Responsibilities For Principal Site Reliability Engineer

  • Design, deploy, and manage highly available, reliable, and scalable database architectures
  • Monitor and optimize database performance
  • Develop and implement database backup and disaster recovery strategies
  • Perform database capacity planning and resource utilization analysis
  • Collaborate with development teams on optimization
  • Troubleshoot and resolve database-related incidents
  • Implement security and access control measures
  • Create and implement monitoring and alerting solutions
  • Participate in on-call rotation
  • Lead and mentor other members of the Site Reliability Engineering team
  • Find efficiencies in data handling processes
  • Coordinate with analytics teams on reporting and data warehousing

Requirements For Principal Site Reliability Engineer

  • 8+ years of technical experience in software engineering, network engineering, or systems administration
  • 8+ years of experience with database technologies
  • 8+ years of experience with performance optimization
  • Bachelor's, Master's, or Doctorate degree in Computer Science, Information Technology, or a related field
  • Must pass Microsoft Cloud Background Check
  • Ability to collaborate across many teams and applications

Benefits For Principal Site Reliability Engineer

  • Industry-leading healthcare
  • Educational resources
  • Discounts on products and services
  • Savings and investments
  • Maternity and paternity leave
  • Generous time away
  • Giving programs
  • Opportunities to network and connect
