Site Reliability Engineering (SRE) at Salesforce combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. This principal role will shape the technical strategy for SRE and influence the Availability Cloud strategy. The position involves embedding with product teams, defining availability roadmaps, and delivering against them while mentoring other engineers.
The role focuses on enabling service owners to operate at scale through observability frameworks, system optimization, and infrastructure design. You'll tackle complex scaling challenges unique to Salesforce, applying your expertise in coding, algorithms, and large-scale system design. The SRE team values diversity, intellectual curiosity, and problem-solving in a blame-free environment.
As a Principal Engineer, you'll lead technical initiatives, uncover themes, design solutions, and implement improvements that enhance service reliability. The position requires hands-on coding (at least 25% of the time) and collaboration with cross-functional teams. You'll need to challenge the status quo, communicate effectively, and influence through data-driven insights.
The ideal candidate brings 15+ years of software development experience, deep expertise in distributed systems, and a track record of leading multi-team initiatives. You should be passionate about mentoring others, have mastered object-oriented programming, and have extensive experience with cloud technologies and service ownership practices.
Join Salesforce to shape the future of enterprise reliability while working with cutting-edge technologies and talented engineers in a collaborative, growth-oriented environment.