Site Reliability Engineer (SRE)

AI-powered Personal & Entrepreneurial Resource Planner (PRP) company founded in 2018, with over 100 million downloads worldwide.
$116,000 - $200,000
Site Reliability
Senior Software Engineer
Hybrid
4+ years of experience
AI · Enterprise SaaS

Description For Site Reliability Engineer (SRE)

Air Apps builds an AI-powered Personal & Entrepreneurial Resource Planner (PRP). Founded in 2018 in Lisbon and now operating from both Lisbon and San Francisco, the company has reached over 100 million downloads worldwide while remaining self-funded.

As a Site Reliability Engineer (SRE), you'll be responsible for system reliability and scalability. The role combines software development and operations and requires expertise in cloud platforms, Infrastructure as Code, and modern DevOps practices. You'll work with Kubernetes, monitoring tools such as Prometheus and Grafana, and cloud platforms (AWS, Azure, or GCP).

Compensation ranges from $116K to $200K, with benefits including medical insurance, 401k matching, and an annual stipend. The hybrid work environment provides flexibility while keeping opportunities for in-person collaboration.

Key responsibilities include designing scalable systems, implementing automation, managing observability tools, and participating in on-call rotations. The ideal candidate brings 4+ years of SRE/DevOps experience, strong Linux administration skills, and expertise in cloud platforms and containerization.

Air Apps is a rapidly growing company that values innovation and technical excellence. Its commitment to diversity and inclusion, combined with its mission to transform how people plan, work, and live, makes this a strong fit for ambitious engineers who want to make a significant impact.

Responsibilities For Site Reliability Engineer (SRE)

  • Design and implement scalable, reliable, and fault-tolerant systems across cloud environments
  • Develop and maintain observability tools, including monitoring, logging, and alerting
  • Automate infrastructure provisioning, deployment, and incident response using Infrastructure as Code (IaC)
  • Optimize system performance, scalability, and incident response workflows
  • Work closely with development and DevOps teams to improve system design for reliability
  • Conduct root cause analysis (RCA) and implement preventative measures
  • Ensure high availability by designing and maintaining load balancing, failover, and disaster recovery strategies
  • Improve CI/CD pipelines to enhance deployment speed while maintaining stability
  • Optimize cloud cost and resource utilization for AWS, Azure, or Google Cloud Platform (GCP)
  • Participate in on-call rotations to quickly address system failures and minimize downtime

Requirements For Site Reliability Engineer (SRE)

  • 4+ years of experience in Site Reliability Engineering (SRE), DevOps, or Systems Engineering
  • Strong knowledge of cloud platforms (AWS, Azure, or GCP) and cloud-native architectures
  • Experience with observability and monitoring tools (Prometheus, Grafana, ELK, Datadog, New Relic)
  • Proficiency in Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Pulumi
  • Hands-on experience with containerization and orchestration (Docker, Kubernetes, Helm)
  • Strong Linux system administration and networking fundamentals
  • Experience with incident management, debugging, and root cause analysis
  • Proficiency in scripting (Bash, Python, or Go) for automation and system monitoring
  • Knowledge of load balancing, failover strategies, and distributed systems
  • Understanding of security best practices, access control, and compliance requirements
  • Strong communication skills and the ability to collaborate with cross-functional teams

Benefits For Site Reliability Engineer (SRE)

  • Remote-first approach with flexible working hours
  • Apple hardware ecosystem for work
  • Annual Bonus
  • Medical Insurance (including vision & dental)
  • Short- and long-term disability insurance
  • 401k matching, up to 4% contribution
  • Air Stipend of $3,120/year
  • Air Conference 2025 in Las Vegas
