Lead Site Reliability Engineer

Corelight is a cybersecurity company that transforms network and cloud activity into evidence for elite defenders to proactively hunt threats and accelerate response to cyber incidents.
North, SC 29112, USA
$180,000 - $225,000
Site Reliability
Staff Software Engineer
8+ years of experience
Cybersecurity

Description For Lead Site Reliability Engineer

As a Lead Site Reliability Engineer (SRE), you will ensure the stability, performance, and security of our Federal region's cloud platform. You'll manage infrastructure and operations with a focus on availability, latency, performance optimization, monitoring, incident response, and capacity planning. This role requires maintaining a FedRAMP-compliant environment and working closely with security and engineering teams to meet the highest standards of security and compliance.

Key Responsibilities:

  • Collaborate with software engineering teams to ensure reliability, performance, and security of the Federal region's infrastructure.
  • Design, implement, and manage FedRAMP-compliant infrastructure and systems.
  • Establish continuous monitoring, logging, and auditing processes for FedRAMP compliance.
  • Partner with security teams for assessments and control implementation.
  • Design scalable infrastructure solutions supporting multi-region growth.
  • Drive automation efforts for efficient, compliant infrastructure scaling.
  • Stay updated on best practices, security threats, and FedRAMP guidelines.
  • Deploy and maintain resilient cloud-native services in AWS.
  • Participate in 24x7 incident response and on-call rotations.
  • Plan for capacity and platform growth.

Technical Skills Required:

  • 8+ years of experience with FedRAMP environments or similar regulated systems.
  • Expertise in AWS services (EC2, S3, RDS, Lambda, ECS/EKS, Glue, EMR, Redshift, OpenSearch, VPC).
  • Deep understanding of FedRAMP framework, controls, and compliance requirements.
  • Proficiency in Python, Go, or Java.
  • Experience with big data technologies (Hadoop, Spark, Kafka).
  • Strong skills in Infrastructure as Code tools (Terraform, CloudFormation, Ansible).
  • Knowledge of containerization and orchestration (Docker, Kubernetes).
  • Experience with CI/CD tools (Jenkins, GitLab CI, CircleCI).
  • Track record in building high-availability, resilient platforms with strict SLOs.
  • Strong experience with Unix/Linux systems and AWS.

Additional Requirements:

  • U.S. citizenship at time of hire.
  • Residence within the contiguous United States.
  • Willingness to undergo a Single Scope Background Investigation, if required.

Join a team that values collaboration, innovation, and excellence, working on cutting-edge projects to secure critical organizations worldwide.

Last updated 3 months ago

Responsibilities For Lead Site Reliability Engineer

  • Collaborate with software engineering teams to ensure the reliability, performance, and security of the Federal region's infrastructure
  • Design, implement, and manage FedRAMP-compliant infrastructure and systems
  • Establish continuous monitoring, logging, and auditing processes to ensure compliance with FedRAMP controls
  • Partner with security teams to conduct security assessments and implement necessary controls
  • Design and implement scalable infrastructure solutions that support multi-region growth
  • Drive automation efforts, enabling infrastructure and platforms to scale efficiently with a focus on compliance
  • Stay up to date on best practices, evolving security threats, and FedRAMP guidelines to maintain a strong security posture
  • Deploy and maintain cloud-native services in AWS that are resilient and elastic
  • Participate in 24x7 incident response and on-call rotations
  • Plan for capacity and work with teams to prepare for platform growth

Requirements For Lead Site Reliability Engineer

Python, Go, Java, Kubernetes, Linux
  • 8+ years of experience building and operating FedRAMP environments or similarly regulated systems
  • Expertise in AWS services (e.g., EC2, S3, RDS, Lambda, ECS/EKS, Glue, EMR, Redshift, OpenSearch, VPC)
  • Deep understanding of the FedRAMP framework, controls, and compliance requirements
  • Proficiency in programming languages such as Python, Go, or Java
  • Experience with big data technologies (Hadoop, Spark, Kafka)
  • Strong skills in Infrastructure as Code (IaC) tools like Terraform, CloudFormation, or Ansible
  • Knowledge of containerization and orchestration tools like Docker and Kubernetes
  • Experience with CI/CD tools such as Jenkins, GitLab CI, or CircleCI
  • Proven track record in building and scaling platforms with high availability, resilience, and strict SLOs
  • Strong experience with Unix/Linux systems and cloud providers, ideally AWS
  • U.S. citizenship at the time of hire
  • Residence within the contiguous United States
  • Willingness to undergo a Single Scope Background Investigation, if required
