Fidelity Investments is seeking a Principal Site Reliability Engineer to build and operate highly resilient platforms in AWS cloud environments. This role involves coordinating systems using Infrastructure as Code tools, performing reliability engineering throughout the SDLC, and deploying distributed multi-tiered applications using Kubernetes and CI/CD pipelines.
The ideal candidate will build and maintain dashboards, including SLI/SLO dashboards, that capture application performance metrics using tools like Splunk, Grafana, Prometheus, and Datadog. They will also identify and resolve application issues and support applications hosted in AWS and Kubernetes.
Key responsibilities include automating operational activities, analyzing application observability and performance, conducting root cause analysis, and ensuring business continuity.
Requirements include a Bachelor's degree in Computer Science or a related field with 5 years of experience, or a Master's degree with 3 years of experience. The candidate must have demonstrated expertise in site reliability engineering, Kubernetes platforms, and automation tools.
Fidelity offers a comprehensive benefits package including 401(k) with company match, medical coverage, parental leave, and student loan assistance. The position is based in Westlake, TX, with a hybrid working model that requires onsite presence every other week.