Principal Site Reliability Engineer

Global leader in CRM and enterprise cloud computing solutions.
$223,000 - $323,400
Site Reliability
Principal Software Engineer
In-Person
5,000+ Employees
15+ years of experience
Enterprise SaaS · Cloud
This job posting may no longer be active.

Description For Principal Site Reliability Engineer

Salesforce is seeking a Principal Site Reliability Engineer to join its Availability Engineering teams. This role is crucial in driving 'best in class' availability across its multi-substrate engineering platform, which serves tens of millions of users. The position requires deep expertise in large-scale systems and concurrency, with a focus on crafting highly available solutions.

As a Principal SRE, you'll work with delivery teams to implement and maintain resilient applications deployed across thousands of compute nodes in multiple data centers. The role involves championing resiliency best practices, working with various cloud platforms (AWS, GCP, Azure & Alibaba), and contributing to open-source technologies.

The ideal candidate will bring 15+ years of software development experience, with at least 5 years in a leadership role. You'll be responsible for reverse engineering solutions, defining availability improvement projects, and maintaining critical infrastructure services. This position offers the opportunity to work on complex technical challenges while ensuring system reliability for one of the world's leading enterprise software companies.

You'll be part of a specialist unit focused on availability and resilience, where you'll have the chance to influence architectural decisions, mentor team members, and drive innovation in system availability. The role combines technical leadership with hands-on development, requiring both strategic thinking and practical implementation skills.

This is an excellent opportunity for a seasoned engineer who is passionate about system reliability, enjoys solving complex distributed systems challenges, and wants to make a significant impact on a platform that powers businesses worldwide. The position offers competitive compensation and the chance to work with cutting-edge technologies in a collaborative environment.

Last updated 21 days ago

Responsibilities For Principal Site Reliability Engineer

  • Embed with delivery teams in a Lead capacity, focusing on corrective and proactive availability measures
  • Design, develop, debug, and operate resilient applications across distributed systems
  • Champion resiliency best practices including observability tool integration and auto-scaling
  • Develop Infrastructure-as-Code using Terraform
  • Build/integrate with APIs and microservices on containerization frameworks
  • Resolve complex technical issues and drive innovations for system availability
  • Participate in on-call rotation to address complex problems in real-time
  • Balance live runtime management, feature delivery, and retirement of technical debt

Requirements For Principal Site Reliability Engineer

Java · Python · Kubernetes
  • Related technical degree required (master's preferred)
  • 15+ years of hands-on software development experience
  • 5+ years in a Tech Lead, Principal or Architect capacity
  • Mastery of object-oriented languages such as Java, Golang, Apex, and Python
  • Deep experience with core web technologies: HTTP, JSON, REST, XML
  • Proficiency with databases including Oracle or other relational/NoSQL solutions
  • Experience owning and operating multiple instances of critical services
  • Subject matter expertise in service ownership best practices
  • Thorough knowledge of Agile development methodology
