Site Reliability Engineer

Site Reliability
Hybrid
Enterprise SaaS · E-Commerce

Description For Site Reliability Engineer

commercetools is seeking a Site Reliability Engineer to join its team in a hybrid work environment. This role focuses on managing critical infrastructure across AWS and GCP, with an emphasis on automation, reliability, and developer experience. The ideal candidate will work with Terraform, Kubernetes, and GitOps practices as part of a diverse, international team.

The position centers on platform engineering: developing infrastructure automation, optimizing multi-cloud environments, and creating self-service platforms. It requires expertise in cloud providers and Infrastructure as Code, along with strong automation skills. The company values diversity and inclusion and offers benefits that include a workation policy, learning opportunities, and flexible working arrangements.

As an SRE at commercetools, you'll be part of a forward-thinking environment where continuous improvement and knowledge sharing are encouraged. The company emphasizes both technical excellence and personal growth, providing various learning resources and development opportunities. It is committed to creating meaningful change in the industry while maintaining a culture driven by its Guiding Stars: Drive Results, Cultivate Belonging, Champion Customers, and Adapt Boldly.

The role combines technical challenges with collaborative opportunities, making it ideal for someone who wants to grow their platform engineering skills while working in an inclusive, international environment. The company offers competitive compensation, including stock options, and emphasizes work-life balance through flexible arrangements and comprehensive benefits.

Last updated 22 days ago

Responsibilities For Site Reliability Engineer

  • Develop infrastructure automation using Terraform and Crossplane
  • Optimize Kubernetes environments across multiple cloud providers
  • Create self-service platforms and workflows using Spacelift and GitOps practices
  • Participate in on-call rotations for infrastructure and platform services
  • Work closely with product teams to understand their needs and develop platform solutions
  • Develop scalable tools for automation, address infrastructure drift proactively, and implement security best practices
  • Engage in pair programming, provide constructive code reviews, and foster knowledge sharing

Requirements For Site Reliability Engineer

Go · Python · Kubernetes
  • Practical experience with at least two major cloud providers (AWS and GCP)
  • Demonstrated experience with Infrastructure as Code, particularly Terraform
  • Working knowledge of Kubernetes and its ecosystem
  • Understanding of GitOps practices, CI/CD pipelines, and experience with automation tools
  • Strong automation and scripting capabilities (Python, Bash, Go)
  • Experience with monitoring and observability tools such as Prometheus and Grafana
  • Excellent problem-solving abilities, including expertise in root cause analysis
  • Clear written and verbal communication skills in English
  • Enthusiasm for working in diverse, distributed international teams

Benefits For Site Reliability Engineer

Education Budget · Equity
  • Competitive Compensation Package with salary and stock options
  • Work up to 60 days per year in a different country (Workation)
  • Learning & Development Budget
  • Access to Coursera and Babbel training courses
  • Flexible working hours
  • Diverse and international workplace
