Customer Reliability Engineer - Infra

Industry-leading DataOps platform provider powered by Airflow, helping data teams build reliable data products for analytics and AI.
$130,000 - $150,000
DevOps
Senior Software Engineer
Remote
4+ years of experience
Enterprise SaaS · AI

Description For Customer Reliability Engineer - Infra

Astronomer, the company behind Astro, an industry-leading DataOps platform powered by Airflow, is seeking a Customer Reliability Engineer for its infrastructure team. This role is crucial to ensuring the reliability and performance of its managed Airflow service across multiple cloud platforms. As a CRE, you'll be responsible for operating and maintaining platform infrastructure on AWS, Azure, and GCP, working directly with customers to ensure their success.

The position offers a unique opportunity to develop expertise in cloud engineering, Kubernetes, and modern infrastructure while working with a sophisticated cloud-native product. You'll spend up to 25% of your time on impactful side projects, such as contributing to open-source Airflow or developing internal monitoring systems. The role requires working during core hours of 9AM-3PM Eastern US time, with flexible scheduling for remaining hours.

This is an ideal position for someone passionate about infrastructure, problem-solving, and customer success. You'll be joining a globally distributed, venture-backed team focused on empowering data teams to deliver mission-critical analytics and AI solutions. The company values diverse experiences and unconventional career paths, fostering an inclusive environment for learners and innovators.

The role combines technical expertise with customer interaction, requiring strong communication skills and the ability to manage complex infrastructure challenges. You'll be part of a team that maintains 24/7 platform availability, participating in on-call rotations and working directly with customers to meet SLAs. The position offers competitive compensation, including equity, and the flexibility of remote work with occasional in-person events.

Last updated 2 days ago

Responsibilities For Customer Reliability Engineer - Infra

  • Operate, monitor, and maintain the platform for availability and reliability
  • Work with customers' data engineers and DevOps teams
  • Provide platform support and meet SLAs
  • Participate in 24x7 coverage through a 6-hour pager period
  • Participate in paid on-call rotation for weekend coverage
  • Work on side projects contributing to Astronomer's success
  • Maintain infrastructure across AWS, Azure, and GCP

Requirements For Customer Reliability Engineer - Infra

  • 4+ years of professional experience
  • Experience with Kubernetes/Docker/Containers
  • Experience with major cloud providers (AWS, GCP, Azure)
  • Demonstrable Linux familiarity
  • Excellent written and verbal communication skills
  • Problem-solving and troubleshooting abilities
  • Ability to work 9AM-3PM Eastern US, Monday to Friday

Benefits For Customer Reliability Engineer - Infra

  • Equity compensation
  • Remote-first work environment
  • 2-4 in-person events per year
