Staff Engineer, Platform/SRE

Leading omnichannel customer engagement solution powering personalized customer journeys across mobile and web push notifications, in-app messaging, SMS, and email.
Site Reliability
Staff Software Engineer
Remote
101 - 500 Employees
8+ years of experience
Enterprise SaaS

Description For Staff Engineer, Platform/SRE

OneSignal is a leading omnichannel customer engagement solution that has achieved remarkable scale, serving billions of HTTP requests daily. As a Series C venture-backed company supported by SignalFire, Rakuten Ventures, and Y Combinator, we're revolutionizing how businesses connect with their users. The Platform/SRE team plays a crucial role in maintaining our 99.95% uptime while scaling our infrastructure.

As a Staff Engineer in Platform/SRE, you'll be at the forefront of our infrastructure engineering efforts, working with cutting-edge technologies like Rust and Go. You'll operate our infrastructure and engineer its future, managing critical systems including PostgreSQL, Scylla, Redis, Kafka, and Kubernetes. The role combines software engineering with an infrastructure focus, requiring both deep technical knowledge and automation expertise.

The position offers unique challenges in scaling our systems while maintaining high reliability. You'll work with a diverse tech stack, automate data center operations, and collaborate across teams to architect highly scalable services. The role requires not just operational knowledge but the ability to transform manual processes into automated systems.

OneSignal offers a remote-first culture with offices in San Mateo, CA and London, UK. We pride ourselves on maintaining a healthy work-life balance and fostering an environment of ownership and personal growth. Join us in shaping the future of customer engagement technology while working with a team that values both technical excellence and human connection.

Last updated 5 days ago

Responsibilities For Staff Engineer, Platform/SRE

  • Improve system performance by identifying and removing bottlenecks
  • Set up infrastructure and configuration as code using Kubernetes and Terraform
  • Establish and maintain the observability and monitoring stack
  • Define and implement CI/CD best practices
  • Collaborate with engineering teams to architect scalable services
  • Participate in the on-call rotation and incident response

Requirements For Staff Engineer, Platform/SRE

Technologies: Go, Rust, Kubernetes, PostgreSQL, Redis, Kafka
  • At least 8 years of platform experience
  • Experience operating reliable production systems at scale
  • Knowledge of Linux systems internals
  • Ability to automate tasks
  • Experience in managing PostgreSQL for high-scale throughput systems
  • Operational experience deploying and managing Kubernetes
  • Experience working with cloud providers (AWS/GCP/Azure)
