Principal/Architect- Software Engineering - Availability

A leading cloud-based software company providing customer relationship management and enterprise solutions.
$211,500 - $384,100
Site Reliability
Principal Software Engineer
In-Person
5,000+ Employees
15+ years of experience
Enterprise SaaS · AI

Description For Principal/Architect- Software Engineering - Availability

Salesforce is seeking a Principal/Architect Software Engineer for its Site Reliability Engineering (SRE) team. This role combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. The position is crucial to ensuring that Salesforce services maintain the reliability, capacity, performance, and availability customers need.

The role involves leading technical strategy for SRE and influencing the Availability Cloud's direction. You'll work directly with product teams, define availability roadmaps, and deliver against them. The position requires both technical excellence and leadership skills, as you'll be actively developing and mentoring engineers while scaling the impact of your community.

Key focus areas include software development for service operations at scale, observability framework integrations, system optimization, and infrastructure design. You'll tackle complex challenges unique to Salesforce's scale while applying expertise in coding, algorithms, and large-scale system design.

The SRE practice at Salesforce values diversity, intellectual curiosity, and problem-solving in a blame-free environment. You'll collaborate with professionals from various backgrounds, taking calculated risks and working on meaningful projects while receiving support and mentorship for continuous growth.

This role is perfect for someone who isn't afraid to challenge the status quo, can communicate effectively, and can influence cross-organizational initiatives through data-driven insights. You'll be hands-on with code while leading technical initiatives that improve service reliability for Salesforce's customers.

Last updated 6 days ago

Responsibilities For Principal/Architect- Software Engineering - Availability

  • Spearhead and enable the culture of Service Ownership
  • Engage in and improve the whole lifecycle of services
  • Support services before they go live through system design consulting
  • Develop full paved-path observability platform integrations
  • Scale systems sustainably through automation
  • Practice sustainable incident response and blameless postmortems
  • Spend at least 25% of your time on hands-on coding
  • Develop and grow the engineering talent

Requirements For Principal/Architect- Software Engineering - Availability

Java · Python · Kubernetes · Go
  • 15+ years of software development and engineering experience
  • Experience designing, building, and operating large-scale distributed systems
  • Experience leading initiatives spanning multiple teams
  • Ability to effectively collaborate across multiple teams
  • Experience mentoring and developing engineers
  • Mastery of object-oriented languages (Java, Golang, Python, C++, C)
  • Experience with Kubernetes, Istio, and public cloud (AWS)
  • Deep experience with core web technologies
  • Experience owning and operating critical services
  • Expertise in Service Ownership best practices
  • Knowledge of Agile development methodology
  • Experience in fault modeling and chaos engineering
