Principal Site Reliability Engineer

Fidelity Investments

A privately held financial services company focused on making financial expertise broadly accessible and effective in helping people live the lives they want.

Austin, TX, USA

Site Reliability

Principal Software Engineer

In-Person

5+ years of experience

Finance · Enterprise SaaS

Description For Principal Site Reliability Engineer

Fidelity Investments is seeking a Principal Site Reliability Engineer to build and operate highly resilient platforms in AWS cloud environments. This role involves coordinating systems using Infrastructure as Code tools, performing reliability engineering throughout the SDLC, and deploying distributed multi-tiered applications using Kubernetes and CI/CD pipelines.

The ideal candidate will create and maintain dashboards to capture application performance metrics using tools like Splunk, Grafana, Prometheus, and Datadog. They will be responsible for creating SLI/SLO dashboards, identifying and resolving application issues, and supporting applications hosted in AWS Cloud and Kubernetes.

Key responsibilities include providing automated solutions for operational activities, analyzing application observability and performance, conducting root cause analysis, and ensuring business continuity.

Requirements include a Bachelor's degree in Computer Science or related field with 5 years of experience, or a Master's degree with 3 years of experience. The candidate must have demonstrated expertise in site reliability engineering, Kubernetes platforms, and automation tools.

Fidelity offers a comprehensive benefits package including a 401(k) with company match, medical coverage, parental leave, and student loan assistance. The position is based in Westlake, TX, with a hybrid working model requiring onsite presence every other week.

Last updated 13 days ago

Responsibilities For Principal Site Reliability Engineer

Build and operate resilient platforms in AWS cloud environments
Create and maintain performance monitoring dashboards
Perform reliability engineering throughout the SDLC
Deploy and support distributed multi-tiered applications
Provide automated solutions for operational activities
Conduct root cause analysis and resolve critical issues
Manage application scalability and resiliency
Mentor junior team members

Requirements For Principal Site Reliability Engineer

Python · Kubernetes · Redis · Java · Node.js

Bachelor's or Master's degree in Computer Science, Engineering, or related field
5 years of experience (with a Bachelor's degree) or 3 years (with a Master's degree) as a Principal Site Reliability Engineer
Expertise in site reliability engineering and performance analysis
Experience with Kubernetes platforms and cloud environments
Knowledge of monitoring tools like Splunk, Grafana, Prometheus, and Datadog
Proficiency in Python, shell scripting, Git, and Docker

Benefits For Principal Site Reliability Engineer

401(k) · Medical Insurance · Dental Insurance · Vision Insurance · Parental Leave

401(k) with company match
Medical, dental, vision and prescription drug coverage
16-week maternity leave & 12-week parental leave
Student loan assistance
