Site Reliability Engineer - Platform Microservices Reliability

Guidewire is the platform P&C insurers trust to engage, innovate, and grow efficiently. We provide software for Property and Casualty (P&C) Insurance companies to handle core operations, data management, digital online portals, and predictive analytics.
Site Reliability
Senior Software Engineer
Hybrid
1,000 - 5,000 Employees
5+ years of experience
Finance · Enterprise SaaS

Description For Site Reliability Engineer - Platform Microservices Reliability

At Guidewire, we develop software for Property and Casualty (P&C) Insurance companies, providing tools for core applications, data management, digital portals, and predictive analytics. As a Site Reliability Engineer on the Platform team, you'll be crucial in automating and improving the reliability of Guidewire's cloud platform and InsuranceSuite products.

Your role will involve:

  • Taking a purist SRE approach to shared multi-tenant infrastructure
  • Overseeing and automating Guidewire's AWS presence
  • Contributing to core infrastructure systems development
  • Performing platform reliability engineering for authentication systems
  • Building tooling for 24x7x365 operations
  • Automating deployment tasks
  • Creating documentation and training materials
  • Improving incident management and platform observability

You'll work in a collaborative environment, solving problems at scale with technologies like AWS, Kubernetes, and Aurora. The ideal candidate has a passion for automation, can rapidly self-educate, and has experience with production support of SaaS platforms in cloud-native environments.

Guidewire offers a fun work environment with a culture based on integrity, rationality, and collegiality. We're recognized as a Top Cloud Employer on Glassdoor and a market leader by Gartner. Join us to make an impact, be inspired by your colleagues, and enjoy a career where you're trusted and empowered to go further.

Required skills include software engineering with Bash, Python, or Go, deep Linux knowledge, AWS expertise, experience with containerization, and familiarity with various DevOps tools and practices. You should have a strong background in production support, database management, and observability tools.

This role requires occasional travel (less than 5%) and participation in rotating on-call support. If you're passionate about reliability, automation, and solving complex problems in a collaborative environment, we'd love to hear from you!

Last updated 15 days ago

Responsibilities For Site Reliability Engineer - Platform Microservices Reliability

  • Automate and improve reliability of Guidewire's cloud platform and InsuranceSuite products
  • Oversee and automate Guidewire's AWS presence
  • Contribute to core infrastructure systems development
  • Perform platform reliability engineering for authentication systems
  • Build tooling for 24x7x365 operations
  • Automate deployment tasks
  • Create system documentation and training materials
  • Build and maintain observability tooling and dashboards
  • Improve incident management lifecycle
  • Enhance platform observability
  • Collaborate with engineering teams

Requirements For Site Reliability Engineer - Platform Microservices Reliability

Python · Go · Linux · Java · Kubernetes · PostgreSQL

  • Bachelor's Degree in Computer Science or related field
  • Software engineering skills with Bash, Python, and/or Go
  • Deep background in Linux systems and engineering
  • Experience with Amazon Web Services (AWS)
  • Experience supporting web applications running on Java / Apache / Tomcat
  • Experience with IaC tools like Terraform/Terragrunt/Terraspace
  • Experience with DevOps/GitOps tools (Git, Bitbucket, Flux CD, TeamCity)
  • Expertise in containerization (Docker, Helm, Kubernetes/EKS)
  • Understanding of single sign-on (SSO), SAML, and OAuth
  • Experience with relational databases such as Aurora PostgreSQL and/or Oracle on RDS
  • Experience with observability tools like Datadog, CloudWatch, and PagerDuty
