Senior Site Reliability Engineer

Site Reliability
Senior Software Engineer
In-Person
8+ years of experience
Enterprise SaaS

Description For Senior Site Reliability Engineer

XperiencOps Inc is seeking a Senior Site Reliability Engineer to ensure the reliability, scalability, and performance of its enterprise software platform. The role demands deep technical expertise in AWS cloud technologies and serverless architectures, with a focus on maintaining 24/7 platform availability for major enterprise customers. It combines hands-on technical work with leadership responsibilities: system design, incident management, automation development, and mentoring junior engineers. The ideal candidate has extensive experience with cloud services, particularly AWS, and strong skills in observability systems and infrastructure as code. The role offers the opportunity to work with cutting-edge technologies while making a significant impact on platform stability and customer satisfaction. It requires participation in on-call rotations and suits someone who thrives in a fast-paced startup environment where adaptability and proactive problem-solving are essential. Benefits include comprehensive health coverage and the chance to join a growing startup.

Last updated 3 months ago

Responsibilities For Senior Site Reliability Engineer

  • Design, implement, and manage highly available and scalable systems
  • Monitor, troubleshoot, and resolve platform incidents
  • Lead post-incident reviews
  • Develop and maintain automation for infrastructure management
  • Optimize platform performance and scalability
  • Contribute to CI/CD pipelines
  • Partner with L2 engineers to resolve complex customer issues
  • Work closely with product engineering
  • Mentor junior engineers and provide technical leadership
  • Drive cross-functional initiatives to improve platform stability

Requirements For Senior Site Reliability Engineer

  • Python
  • Go
  • Linux
  • Kubernetes
  • Bachelor's degree in Computer Science or related discipline
  • 8+ years in a Site Reliability Engineering or DevOps role
  • 3+ years of experience in cloud services, particularly AWS
  • Experience building observability systems on New Relic, CloudWatch, or similar
  • Experience implementing rate-limiting, API gateways, and load balancing
  • Exposure to security best practices and compliance frameworks
  • Proficiency in infrastructure as code (IaC)
  • Hands-on experience with scripting and programming languages
  • Strong troubleshooting and debugging skills
  • Excellent communication and collaboration skills
  • Experience with incident management and post-mortem practices
  • Availability to participate in a 24/7 on-call rotation

Benefits For Senior Site Reliability Engineer

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Opportunity to work on cutting-edge products and make a real impact
  • Collaborative and fast-paced work environment
  • Chance to be part of a rapidly growing startup
  • Competitive salary and benefits package
  • Paid time off
