Site Reliability Engineer

BrainGu is a technology company that builds developer platforms, with a focus on dual-use technology platforms that unlock innovation.
$150,000 - $170,000
Site Reliability
Senior Software Engineer
In-Person
6+ years of experience
Enterprise SaaS

Description For Site Reliability Engineer

BrainGu is an innovative technology company specializing in building developer platforms, with a focus on its flagship Developer Experience Platform, SmoothGlue. As a Site Reliability Engineer within the Engineering Operations Value Stream, you'll play a crucial role in implementing and maintaining highly available, scalable, and secure systems.

The position offers an exciting opportunity to work with cutting-edge technologies including Kubernetes, container technologies, and major cloud platforms. You'll be responsible for designing and implementing infrastructure as code, managing CI/CD pipelines, and ensuring system reliability and performance. The role requires a strong background in SRE practices and the ability to automate processes effectively.

Working closely with the EngOps CTO and Platform Product team, you'll have the chance to influence roadmaps and drive organizational maturity. The company offers an impressive benefits package including fully paid insurance plans, generous PTO, parental leave, and professional development opportunities through its BrainBudget program.

The ideal candidate will bring 6+ years of relevant experience, strong technical expertise in cloud technologies, and excellent communication skills. This role is perfect for someone who thrives in a collaborative environment, enjoys mentoring others, and is passionate about building resilient, scalable systems. The position offers competitive compensation ranging from $150,000 to $170,000 and requires a willingness to travel up to 50% of the time.

What sets this opportunity apart is the chance to work on dual-use technology platforms that make a real impact, combined with a culture that values innovation and continuous improvement. The role offers significant growth potential and the opportunity to work with a team dedicated to creating order-of-magnitude improvements in software quality, resilience, and security.

Last updated 2 hours ago

Responsibilities For Site Reliability Engineer

  • Design, implement, and manage highly available, scalable, and fault-tolerant systems
  • Collaborate with software engineering teams to optimize application performance and reliability
  • Develop and maintain infrastructure as code (IaC)
  • Implement CI/CD pipelines
  • Establish and maintain monitoring, alerting, and logging systems
  • Respond to incidents and troubleshoot issues
  • Analyze system performance and identify bottlenecks
  • Conduct capacity planning
  • Implement security best practices
  • Provide mentorship and technical leadership

Requirements For Site Reliability Engineer

Kubernetes
Redis
  • Bachelor's degree or equivalent work experience
  • 6+ years of relevant work experience
  • Experience with Kubernetes (k8s) and container technologies
  • Experience troubleshooting cloud environments (AWS, GCP or Azure)
  • Experience with HashiCorp Vault or CyberArk
  • AWS Solutions Architect - Associate certification preferred
  • Must have an active Secret clearance
  • Willing to travel up to 50%
  • U.S. Citizenship required

Benefits For Site Reliability Engineer

401k
Medical Insurance
Dental Insurance
Vision Insurance
Parental Leave
Education Budget
  • 12 weeks of fully paid parental leave
  • 31 days of PTO including federal holidays
  • 100% employer-paid insurance plans
  • 401(k) matching up to 5%
  • $10k BrainBudget for personal and professional growth
  • $1,500 Battle Station Budget for home office
  • 85% paid healthcare premiums for family
  • Monthly cell phone and internet stipend
  • Supplemental Tricare plan for Veterans
  • Monthly stipend for Veterans

Jobs Related To BrainGu Site Reliability Engineer

Site Reliability Engineer (SRE) Specialist

Senior Site Reliability Engineer position at Capco, focusing on cloud operations and ServiceNow implementation in financial services sector.

Senior Site Reliability Engineer

Lead SRE position at PayPay, managing observability pipeline across EKS clusters and driving reliability culture.

Senior Site Reliability Engineer

Senior Site Reliability Engineer position at BetterUp, focusing on system reliability, infrastructure management, and operational excellence in a mission-driven company.