Site Reliability Engineer

BrainGu is a technology company that builds developer platforms, with a focus on dual-use technology platforms that unlock innovation.
$150,000 - $170,000
Site Reliability
Senior Software Engineer
In-Person
6+ years of experience
Enterprise SaaS

Description For Site Reliability Engineer

BrainGu is an innovative technology company specializing in building developer platforms, with a focus on its flagship Developer Experience Platform, SmoothGlue. As a Site Reliability Engineer within the Engineering Operations Value Stream, you'll play a crucial role in implementing and maintaining highly available, scalable, and secure systems.

In this position you'll work with Kubernetes, container technologies, and major cloud platforms (AWS, GCP, or Azure). You'll be responsible for designing and implementing infrastructure as code, managing CI/CD pipelines, and ensuring system reliability and performance. The role requires a strong background in SRE practices and the ability to automate processes effectively.

Working closely with the EngOps CTO and Platform Product team, you'll have the chance to influence roadmaps and drive organizational maturity. The company offers an impressive benefits package including fully paid insurance plans, generous PTO, parental leave, and professional development opportunities through its BrainBudget program.

The ideal candidate will bring 6+ years of relevant experience, strong technical expertise in cloud technologies, and excellent communication skills. This role is perfect for someone who thrives in a collaborative environment, enjoys mentoring others, and is passionate about building resilient, scalable systems. The position offers competitive compensation ranging from $150,000 to $170,000 and requires a willingness to travel up to 50% of the time.

What sets this opportunity apart is the chance to work on dual-use technology platforms that make a real impact, combined with a culture that values innovation and continuous improvement. The role offers significant growth potential and the opportunity to work with a team dedicated to creating order-of-magnitude improvements in software quality, resilience, and security.

Last updated 2 months ago

Responsibilities For Site Reliability Engineer

  • Design, implement, and manage highly available, scalable, and fault-tolerant systems
  • Collaborate with software engineering teams to optimize application performance and reliability
  • Develop and maintain infrastructure as code (IaC)
  • Implement CI/CD pipelines
  • Establish and maintain monitoring, alerting, and logging systems
  • Respond to incidents and troubleshoot issues
  • Analyze system performance and identify bottlenecks
  • Conduct capacity planning
  • Implement security best practices
  • Provide mentorship and technical leadership

Requirements For Site Reliability Engineer

Key skills: Kubernetes, Redis
  • Bachelor's degree or equivalent work experience
  • 6+ years of relevant work experience
  • Experience with Kubernetes (k8s) and container technologies
  • Experience troubleshooting cloud environments (AWS, GCP, or Azure)
  • Experience with HashiCorp Vault or CyberArk
  • AWS Solutions Architect - Associate certification preferred
  • Must have an active Secret clearance
  • Willing to travel up to 50%
  • U.S. Citizenship required

Benefits For Site Reliability Engineer

Benefit categories: 401(k), Medical Insurance, Dental Insurance, Vision Insurance, Parental Leave, Education Budget
  • 12 weeks of fully paid parental leave
  • 31 days of PTO including federal holidays
  • 100% employer-paid insurance plans
  • 401(k) matching up to 5%
  • $10k BrainBudget for personal and professional growth
  • $1,500 Battle Station Budget for home office
  • 85% paid healthcare premiums for family
  • Monthly cell phone and internet stipend
  • Supplemental Tricare plan for Veterans
  • Monthly stipend for Veterans
