Principal Site Reliability Engineer

UltraViolet Cyber

Leading platform-enabled unified security operations company providing comprehensive security operations solutions.

McLean, VA, USA

$170,000 - $200,000

Site Reliability

Principal Software Engineer

Remote

101 - 500 Employees

8+ years of experience

Cybersecurity · Enterprise SaaS

Description For Principal Site Reliability Engineer

UltraViolet Cyber, a leading unified security operations company, is seeking a Principal Site Reliability Engineer to join its team. This role combines advanced technical expertise with leadership responsibilities, focusing on enhancing the scalability, reliability, and security of cloud infrastructure. The work centers on Amazon EKS and other AWS services in a cybersecurity setting.

The role demands expertise in Kubernetes, DevOps practices, and cloud infrastructure, with responsibilities ranging from system reliability management to cost optimization. You'll be working with a comprehensive tech stack including Kubernetes, Python, and various AWS services, while implementing security best practices and maintaining high-availability systems.

The company provides an attractive benefits package including 401(k) matching, comprehensive health insurance, and flexible time off. Based in McLean, Virginia, with global offices across the U.S. and India, UltraViolet Cyber serves Fortune 500 companies and Federal Government clients, offering a platform that combines technology innovation with human expertise.

This position is ideal for a seasoned professional who enjoys solving complex technical challenges, mentoring others, and working in a fast-paced environment. The role offers competitive compensation ($170,000-$200,000) and the flexibility of remote work, making it an excellent opportunity for experienced SREs looking to make a significant impact in the cybersecurity space.

Last updated 12 days ago

Responsibilities For Principal Site Reliability Engineer

Ensure availability, performance, scalability, and security of cloud-based services
Architect, deploy, and maintain Kubernetes clusters using Amazon EKS
Automate infrastructure provisioning using IaC tools
Build and maintain CI/CD pipelines
Design and implement monitoring, alerting, and logging solutions
Enforce security best practices and compliance
Conduct capacity planning and scaling
Lead cross-functional collaboration
Manage incidents and perform root cause analysis
Optimize cloud costs while maintaining performance

Requirements For Principal Site Reliability Engineer

Extensive experience in AWS, particularly with EKS clusters
Strong proficiency in Kubernetes ecosystem
Hands-on experience with DevOps tools & methodologies
Proficiency in Python, Bash, or Golang
Experience with observability and monitoring tools
Deep understanding of networking principles
Strong background in security best practices
Experience with highly available, distributed systems
Previous experience in Agile or DevOps culture
Excellent troubleshooting skills
Strong communication and leadership skills
Bachelor's degree in Computer Science, Engineering, or related field

Benefits For Principal Site Reliability Engineer

401(k) with employer match: 100% of the first 3% and 50% of the next 2%
Medical, Dental, and Vision Insurance
Group Term Life Insurance
Short-Term Disability
Long-Term Disability
Discretionary Time Off (DTO) Program
11 Paid Holidays Annually
