Site Reliability Engineer, Managed Service

SingleStore is a platform for all data, enabling enterprises to adapt to change and accelerate innovation.
Site Reliability
Mid-Level Software Engineer
Hybrid
3+ years of experience
Enterprise SaaS

Description For Site Reliability Engineer, Managed Service

SingleStore is seeking a Site Reliability Engineer to help optimize and scale our managed service offering across all three major cloud providers. In this role, you will be at the intersection of leading technology trends: a highly performant distributed database, managed by Kubernetes, running in the cloud. This is a great opportunity to push boundaries in a cloud-focused SRE role.

This is a development role that requires an engineering mindset to solve operational challenges. You will be part of a globally distributed team of engineers, helping to drive SRE practices across the company. Through infrastructure automation, you will help us grow our service across multiple cloud platforms, which requires a relentless focus on eliminating manual processes. You will use our monitoring platform to improve the overall customer experience by systematically identifying and fixing issues that affect our customers. You will also help diagnose issues on the platform, drawing on a deep understanding of the SingleStore query engine and the backend infrastructure.

Roles and Responsibilities:

  • Develop an automation platform to manage infrastructure rollouts across cloud providers
  • Optimize the telemetry platform to identify customer-impacting events while providing relevant data to drive debugging
  • Partner with the engineering team to optimize the performance of services for cloud architecture
  • Debug live site events and conduct follow-up postmortems and root cause analysis (RCA)
  • Participate in an SLA-driven on-call rotation, which will include after-hours, weekend, and rotating holiday participation

Required Skills and Experience:

  • Infrastructure automation experience; Python and Go (Golang) are a plus
  • Knowledge of Kubernetes and the container ecosystem
  • Strong cross-group collaboration and communication skills
  • Familiarity with at least one of AWS, Azure, or Google Cloud
  • Experience debugging, diagnosing, and troubleshooting complex production software
  • B.S. degree in Computer Science or a related field

SingleStore is one platform for all data, built so you can engage with insight in every moment. Trusted by industry leaders, SingleStore enables enterprises to adapt to change as it happens, embrace diverse data with ease, and accelerate the pace of innovation. SingleStore is venture-backed and headquartered in San Francisco with offices in Sunnyvale, Seattle, Boston, London, Lisbon, Bangalore, Dublin and Kyiv. Defining the future starts with The Single Database for All Data-Intensive Applications.

Last updated a month ago

Benefits For Site Reliability Engineer, Managed Service

  • Technology Stipend for New Employees
  • Company and team events
  • Flexible time off
  • Volunteer time off
  • US Stock Options
