Salesforce is seeking a Principal Site Reliability Engineer to join its Availability Engineering teams. This role is crucial to driving best-in-class availability across the company's multi-substrate engineering platform, which serves tens of millions of users. The position requires deep expertise in large-scale systems and concurrency, with a focus on crafting highly available solutions.
As a Principal SRE, you'll work with delivery teams to implement and maintain resilient applications deployed across thousands of compute nodes in multiple data centers. The role involves championing resiliency best practices, working with several cloud platforms (AWS, GCP, Azure, and Alibaba), and contributing to open-source technologies.
The ideal candidate will bring 15+ years of software development experience, with at least 5 years in a leadership role. You'll be responsible for reverse engineering solutions, defining availability improvement projects, and maintaining critical infrastructure services. This position offers the opportunity to work on complex technical challenges while ensuring system reliability for one of the world's leading enterprise software companies.
You'll be part of a specialist unit focused on availability and resilience, where you'll have the chance to influence architectural decisions, mentor team members, and drive innovation in system availability. The role combines technical leadership with hands-on development, requiring both strategic thinking and practical implementation skills.
This is an excellent opportunity for a seasoned engineer who is passionate about system reliability, enjoys solving complex distributed systems challenges, and wants to make a significant impact on a platform that powers businesses worldwide. The position offers competitive compensation and the chance to work with cutting-edge technologies in a collaborative environment.