Principal/Architect- Software Engineering - Availability

A leading cloud-based software company providing customer relationship management and enterprise solutions.
$211,500 - $384,100
Site Reliability
Principal Software Engineer
In-Person
5,000+ Employees
15+ years of experience
Enterprise SaaS · Cloud

Description For Principal/Architect- Software Engineering - Availability

Site Reliability Engineering (SRE) at Salesforce combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. This principal role will shape the technical strategy for SRE and influence the Availability Cloud strategy. The position involves embedding with product teams, defining availability roadmaps, and delivering against them while maturing the SRE practice.

The role focuses on enabling service owners to operate at scale through observability frameworks, system optimization, and infrastructure design. You'll tackle complex scaling challenges unique to Salesforce while applying expertise in coding, algorithms, and large-scale system design. The position requires technical leadership and hands-on coding for at least 25% of your time.

SRE at Salesforce promotes a culture of diversity, intellectual curiosity, and problem-solving in a blame-free environment. The team brings together people with varied backgrounds and perspectives, encouraging collaboration and innovation. You'll work on meaningful projects with the support and mentorship needed to learn and grow.

As a Principal/Architect, you'll lead technical initiatives, uncover recurring themes, design solutions, and implement improvements to enhance service reliability. The role requires strong communication skills, cross-organizational influence, and the ability to collaborate across technical and business boundaries. Success is measured by how well you scale the impact and delivery of your engineering community.

This is an opportunity to join a leading enterprise software company, work with cutting-edge technologies, and shape the future of reliability engineering at scale. The position offers competitive compensation and the chance to work on systems that impact millions of users globally.

Last updated 4 days ago

Responsibilities For Principal/Architect- Software Engineering - Availability

  • Spearhead and enable the culture of Service Ownership
  • Engage in and improve the whole lifecycle of services
  • Support services before they go live through system design consulting
  • Develop complete paved-path integrations with the observability platform
  • Scale systems sustainably through automation
  • Practice sustainable incident response and blameless postmortems
  • Develop and grow engineering talent

Requirements For Principal/Architect- Software Engineering - Availability

Java · Python · Kubernetes · Go
  • 15+ years of software development and engineering experience
  • Experience designing, building, and operating large-scale distributed systems
  • Experience leading initiatives spanning multiple teams
  • Ability to effectively collaborate across multiple teams
  • Experience mentoring and developing engineers
  • Mastery of object-oriented and systems programming languages (Java, Go, Python, C++, C)
  • Experience with Kubernetes, Istio, and public cloud (AWS)
  • Deep experience with core web technologies: HTTP, JSON, REST, XML
  • Experience owning and operating critical services
  • Expertise in service ownership best practices and SLO/SLI/SLA definition
  • Knowledge of Agile development methodology
  • Experience in fault modeling, chaos engineering, and load testing
