Site Reliability Engineer - GovCloud 24x7

Salesforce is a leading cloud-based customer relationship management (CRM) platform.
$114,200 - $157,100
Site Reliability
Senior Software Engineer
In-Person
5,000+ Employees
5+ years of experience
Enterprise SaaS · Cloud · Cybersecurity

Description For Site Reliability Engineer - GovCloud 24x7

Salesforce is seeking a Site Reliability Engineer for its GovCloud 24x7 team. This role is part of the GovCloud Incident Response (GIR) team, which maintains the current infrastructure through daily alert response, smart hands, and incident management. Candidates must be U.S. citizens working on U.S. soil and able to meet customer and government screening standards.

Key responsibilities include:

  • Maintaining customer-facing services at top performance
  • Managing incidents and participating in technical reviews
  • Conducting problem management and participating in RCAs
  • Ensuring compliance with company policies and directives
  • Collaborating with other technical staff to solve issues
  • Staying updated on industry innovations and technologies

The role requires working on a 24/7 team with rotating day and night shifts and participating in an on-call rotation. Candidates should have expertise in TCP/IP technologies, Unix variants (especially Linux and Solaris), monitoring security systems, and incident management. Experience with AWS/C2S infrastructure, scripting languages, and ITIL service operations is essential.

Preferred qualifications include experience with Chef/Puppet, Jenkins/Bamboo/Spinnaker, Java applications, and Kubernetes, as well as Linux+, Red Hat, and AWS certifications. Familiarity with Agile and DevOps processes and experience in resilience engineering and post-incident investigations are highly valued.

This challenging role offers the opportunity to work with cutting-edge technologies in a dynamic, high-stakes environment, supporting critical government cloud infrastructure. Join Salesforce's GovCloud team to make a significant impact on the reliability and performance of essential services.

Last updated 2 months ago

Responsibilities For Site Reliability Engineer - GovCloud 24x7

  • Maintain customer-facing services at top performance
  • Manage incidents and participate in technical reviews
  • Conduct problem management and participate in RCAs
  • Ensure compliance with company policies
  • Collaborate with technical staff to solve issues
  • Stay updated on industry innovations
  • Work on a 24/7 team with rotating shifts
  • Participate in on-call rotation

Requirements For Site Reliability Engineer - GovCloud 24x7

Linux · Python · Go
  • U.S. Citizenship
  • Ability to meet government screening standards
  • Systems engineering experience with enterprise-scale internet services
  • Expertise in TCP/IP technologies
  • Expertise in Unix variants (Linux/Solaris/BSD)
  • Strong understanding of monitoring security systems
  • Strong communication skills
  • Experience in Incident Management and ITIL service operations
  • Experience with AWS/C2S infrastructure
  • Scripting skills in Python, Go, or other languages
