Lead Site Reliability Engineer

Parent company of Bumble, Badoo, Bumble For Friends, and Geneva, pioneering dating and social connection platforms founded in 2014.
Site Reliability
Staff Software Engineer
Hybrid
1,000 - 5,000 Employees
8+ years of experience
Consumer

Description For Lead Site Reliability Engineer

Bumble Inc., the parent company behind popular platforms like Bumble, Badoo, and Bumble For Friends, is seeking a Lead Site Reliability Engineer to join their team. This role is crucial in ensuring the reliability, scalability, and performance of their software systems while bridging the gap between development, security, and operations.

The position offers a unique opportunity to work with a company that's revolutionizing online connections across dating, friendship, and professional networking. As an SRE Lead, you'll be responsible for designing and implementing robust infrastructure solutions, building automation frameworks, and maintaining highly available systems that support millions of users worldwide.

The ideal candidate will bring strong technical expertise in Python or Golang, Kubernetes, and cloud architectures, combined with excellent problem-solving and communication skills. You'll work with cutting-edge technologies and tools while collaborating with cross-functional teams to improve system reliability and performance.

Bumble Inc. stands out for its commitment to inclusion and diversity, actively encouraging applications from people of all backgrounds. They offer a supportive work environment where continuous learning is valued and innovation is encouraged. The hybrid work arrangement in London provides flexibility while maintaining collaborative opportunities with the team.

This role is perfect for a seasoned SRE professional who is passionate about building scalable systems, loves solving complex technical challenges, and wants to contribute to a platform that makes a meaningful impact on how people connect worldwide.

Last updated 5 hours ago

Responsibilities For Lead Site Reliability Engineer

  • Design and build new tools and services from the ground up to solve complex problems
  • Build automation frameworks to streamline repetitive tasks
  • Design and maintain scalable, highly available and fault-tolerant systems
  • Build and maintain observability tooling, including logging, monitoring, tracing and alerting systems
  • Develop and maintain automation tooling to reduce manual intervention
  • Implement infrastructure as code (IaC) for infrastructure provisioning
  • Monitor system health and performance, identifying and fixing issues
  • Respond to system outages, troubleshooting root causes and implementing preventative measures
  • Collaborate with engineering teams and security engineers to improve system reliability, security and performance
  • Participate in on-call rotations
  • Create and maintain documentation to improve knowledge sharing across teams

Requirements For Lead Site Reliability Engineer

Python
Go
Kubernetes
Linux
Kafka
  • Excellent problem-solving and analytical skills
  • Strong communication and collaboration skills
  • Proficiency in at least one of Python or Golang
  • Experience with CI/CD pipelines
  • Strong proficiency with Kubernetes architecture
  • Prior experience in SRE, system administration or DevOps roles
  • Strong proficiency with Linux/Unix operating systems
  • Proficiency in using Puppet for configuration management
  • Hands-on experience with monitoring and observability platforms
  • Experience with cloud architectures on platforms such as GCP or AWS
  • Familiarity with SQL databases and message broker systems such as Kafka
