Sr. Cloud Site Reliability Engineer

Serve Robotics develops sidewalk delivery robots to reimagine how things move in cities, focusing on making deliveries more efficient and accessible.
$150,000 - $200,000
Site Reliability
Senior Software Engineer
Remote
5+ years of experience
Robotics

Description For Sr. Cloud Site Reliability Engineer

Serve Robotics is revolutionizing urban delivery through innovative sidewalk robots. As a Sr. Cloud Site Reliability Engineer, you'll join a team of tech industry veterans solving real-world problems using robotics, machine learning, and computer vision. This senior-level position combines hands-on technical work with leadership responsibilities, focusing on building and maintaining critical SRE tooling and processes.

You'll be instrumental in developing and implementing monitoring solutions, managing service level objectives, and ensuring system reliability. The role involves leading incident response processes, conducting performance tuning, and mentoring other engineers in SRE practices. You'll work closely with cross-functional teams to align system health with business metrics and foster a culture of operational excellence.

The ideal candidate brings 5+ years of experience in SRE, DevOps, or a similar role, along with strong technical expertise in cloud platforms, containerization, and observability tools and excellent leadership and communication skills. You'll work in a fast-paced environment where you can directly influence the future of autonomous delivery systems.

The position offers competitive compensation ($150K-$200K) plus equity, along with the flexibility of remote work. You'll be part of an agile, diverse team that values collaborative problem-solving and respectful communication. This is an opportunity to shape the future of urban delivery while working with cutting-edge technology in a rapidly growing field.

Responsibilities For Sr. Cloud Site Reliability Engineer

  • Develop and refine monitoring and observability tools for system availability and performance
  • Implement best practices for instrumentation using tools such as Prometheus, Grafana, and Datadog
  • Lead the definition and management of SLIs and SLOs
  • Perform capacity planning, load testing, and performance tuning
  • Own the incident response process, including the on-call rotation
  • Conduct and facilitate postmortems
  • Create reporting dashboards connecting reliability data with KPIs
  • Mentor junior and mid-level engineers
  • Conduct training sessions and share knowledge

Requirements For Sr. Cloud Site Reliability Engineer

Kubernetes, Python, Go
  • 5+ years of experience in Site Reliability Engineering, DevOps, or a similar role
  • Experience with major cloud providers (Google Cloud, AWS, Azure)
  • Proficiency in Docker, Kubernetes, or similar containerization platforms
  • Hands-on experience with logging, metrics, and tracing tools
  • Familiarity with Infrastructure-as-Code and scripting
  • Experience with modern CI/CD pipelines
  • Bachelor's degree in Computer Science, Engineering, or a related field
  • Strong leadership and communication skills
  • Strong analytical and problem-solving abilities

Benefits For Sr. Cloud Site Reliability Engineer

  • Equity
