Staff Engineer, Site Reliability

LinkedIn is the world's largest professional network, built to help members achieve more in their careers.
$147,000 - $240,000
Site Reliability
Staff Software Engineer
Hybrid
5+ years of experience
Enterprise SaaS

Description For Staff Engineer, Site Reliability

LinkedIn, the world's largest professional network, is seeking a Staff Site Reliability Engineer to join its Traffic team. This role is crucial in delivering LinkedIn's products and services to over 1 billion members worldwide, handling millions of requests per second.

The Traffic team operates at the edge of LinkedIn's data centers, managing a massive infrastructure that includes Layer 4 and Layer 7 network proxies, load balancers, service discovery, monitoring, and CI/CD pipelines. As a Staff SRE, you'll be responsible for the health, performance, and capacity of critical Internet-facing services.

This position offers an exciting opportunity to work on highly visible projects that directly impact LinkedIn's site reliability and performance. You'll be developing tools and systems to improve deployment capabilities and monitoring in a large-scale Linux environment. The role requires deep technical expertise in systems operations, networking, and programming, with hands-on experience in technologies like Go, Python, Java, and Kubernetes.

The ideal candidate will have 5+ years of experience in UNIX-based large-scale web operations, strong programming skills, and excellent communication abilities. You'll work in a collaborative environment, partnering with various teams to ensure platform operability and scalability.

LinkedIn offers a hybrid work environment, allowing you to balance remote work with office time in Sunnyvale, CA. You'll be part of a team that values innovation, collaboration, and technical excellence, while contributing to LinkedIn's mission of creating economic opportunity for every member of the global workforce.

Key responsibilities include managing service health and capacity, developing deployment tools, implementing monitoring solutions, and participating in on-call rotations. The role requires both technical depth and the ability to work effectively with cross-functional teams.

If you're passionate about large-scale systems, enjoy solving complex technical challenges, and want to make an impact on a platform that serves billions of requests, this role offers an excellent opportunity to advance your career in site reliability engineering at one of the world's leading professional networks.

Last updated 11 days ago

Responsibilities For Staff Engineer, Site Reliability

  • Serve as the primary point of responsibility for the overall health, performance, and capacity of Internet-facing services
  • Develop new product features and assist in their rollout and deployment
  • Develop tools to improve deployment and monitoring capabilities
  • Work closely with partner teams on platform operability
  • Participate in 24x7 rotation for second-tier escalations

Requirements For Staff Engineer, Site Reliability

Go
Python
Java
Linux
Kubernetes
  • B.S. or higher in Computer Science or related technical discipline
  • 2+ years of experience operating and troubleshooting Linux at scale
  • Programming skills in Go, C/C++, Python, Rust, Java
  • 5+ years in a UNIX-based large-scale web operations role
  • Experience with reverse proxies / load balancers
  • Strong interpersonal communication skills
  • Knowledge of data structures, databases, networking, Linux internals

Benefits For Staff Engineer, Site Reliability

  • Hybrid work option
