Staff Site Reliability Engineer

Acquia empowers the world's most ambitious brands to create digital customer experiences that matter. With open source Drupal at its core, the Acquia Digital Experience Platform (DXP) enables marketers, developers, and IT operations teams at thousands of global organizations to rapidly compose and deploy digital products and services that engage customers, enhance conversions, and help businesses stand out.
San José Province, San José, Costa Rica
DevOps
Staff Software Engineer
Remote
1,000 - 5,000 Employees
8+ years of experience
Enterprise SaaS

Description For Staff Site Reliability Engineer

Acquia is seeking a Staff Site Reliability Engineer to play a key role in designing, implementing, and maintaining CI/CD pipelines, cloud infrastructure, and monitoring solutions. This hands-on position requires expertise in tools like ArgoCD, Kubernetes, and cloud-native architecture to achieve operational excellence at scale. The ideal candidate will work closely with engineering teams to ensure rapid, safe, and reliable deployments.

Key responsibilities include:

  • Designing, building, and optimizing CI/CD pipelines using tools like ArgoCD and Jenkins
  • Building and managing scalable infrastructure with Terraform and Kubernetes
  • Architecting cloud environments (AWS, GCP, or Azure) for optimal performance and cost
  • Implementing comprehensive monitoring solutions with Prometheus, Grafana, ELK, and Datadog
  • Championing DevOps culture and best practices across teams
  • Building resilient systems and implementing Service Level Objectives (SLOs)
  • Collaborating with security teams to implement robust security practices
  • Working closely with product development teams to integrate CI/CD practices

Required skills:

  • BS in Computer Science or equivalent experience
  • Proficiency in languages like Go, Python, Ruby, PHP, Java, or JavaScript
  • Strong Unix/Linux administration skills
  • Expertise in CI/CD tools, Kubernetes, cloud platforms, and Infrastructure as Code
  • Experience with monitoring and observability tools
  • Security-focused mindset and excellent problem-solving abilities

Preferred qualifications:

  • 8-13 years of hands-on DevOps or SRE experience
  • Deep knowledge of ArgoCD or similar tools
  • Strong scripting skills in Python, Go, or Bash
  • Experience with service mesh architectures
  • An SRE certification or the Certified Kubernetes Administrator (CKA) credential is a plus

Join Acquia, a global leader in digital experience platforms, and be part of building the future of digital customer experiences.

Last updated 2 months ago

Responsibilities For Staff Site Reliability Engineer

  • Design, build, and optimize CI/CD pipelines
  • Build and manage scalable infrastructure using IaC tools
  • Architect and manage cloud environments
  • Implement comprehensive monitoring and alerting solutions
  • Champion DevOps culture and best practices
  • Build resilient systems and implement SLOs
  • Collaborate with security teams on infrastructure security
  • Work closely with product development teams

Requirements For Staff Site Reliability Engineer

Technologies: Go, Java, JavaScript, Kubernetes, Linux, MongoDB, MySQL, Node.js, PHP, Python, PostgreSQL, Redis, Ruby

  • BS in Computer Science or equivalent experience
  • Proficiency in Go, Python, Ruby, PHP, Java, or JavaScript
  • Strong Unix/Linux administration skills
  • Expertise in CI/CD tools (ArgoCD, Jenkins, etc.)
  • Kubernetes and container orchestration experience
  • Cloud platform proficiency (AWS, GCP, or Azure)
  • Infrastructure as Code (Terraform, Ansible) skills
  • Experience with monitoring tools (Prometheus, Grafana, Datadog, ELK)
  • Security best practices knowledge
  • Excellent troubleshooting and problem-solving skills
  • Strong collaboration and communication abilities

Benefits For Staff Site Reliability Engineer

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
