At Webflow, our mission is to bring development superpowers to everyone. Webflow is the leading visual development platform for building powerful websites without writing code. By combining modern web development technologies into one platform, Webflow enables people to build websites visually, saving engineering time, while clean code is generated seamlessly in the background. From independent designers and creative agencies to Fortune 500 companies, millions worldwide use Webflow to be more nimble, creative, and collaborative. It's the web, made better.
We're looking for a Senior Site Reliability Engineer to improve the reliability and stability of Webflow's customer-facing production infrastructure, which serves millions of page views per hour. Our product is used by over 2 million users across 190 countries, and you'll help ensure our platform is secure and scalable for these users as tens of thousands of projects are launched on Webflow each month.
As a Senior Site Reliability Engineer, you'll:
- Empower engineers on other teams to take control of their services by maintaining monitoring tooling and collaborating on internal best practices for observability.
- Enhance reliability of applications running in Kubernetes by optimizing resource allocation, streamlining upgrade processes, and ensuring scalability and fault tolerance.
- Occasionally dive into the main Webflow application in Node, Python, or Go to better understand (and sometimes fix) its behavior in production.
- Work with peers on Webflow's Customer Support, Partnerships, and Sales teams to support customers using Webflow's services in production.
- Participate in and continuously improve on-call and incident response processes.
You'll thrive as a Senior Site Reliability Engineer if you have:
- Either a background as an ops engineer with an enthusiasm for code, or a background as a software engineer with an enthusiasm for systems administration.
- 5+ years of experience building, maintaining, and debugging distributed systems in a customer-facing environment that allows for little to no downtime.
- Experience navigating and scaling multi-tier cloud environments on either AWS or GCP.
- Experience with container-centric architectures built with Docker and orchestration tools like Kubernetes (EKS, GKE, AKS, OpenShift, etc.), ECS, Docker Swarm, or Mesos.
- Experience with infrastructure-as-code tools like Terraform, Pulumi, Ansible, Puppet, or Chef.
- Experience contributing to full-stack applications built using tools like React, Node, and MongoDB.
- Enthusiasm for mentoring and sponsoring less-experienced engineers.
Bonus points for:
- Experience with Kubernetes, Nginx, Terraform, or Pulumi specifically.
- Experience improving on-call and incident response processes for Engineering.
- Experience working in high-compliance environments or a special interest in security engineering.
Join us at Webflow to build the future of the web and empower millions of users worldwide!