Site Reliability Engineering (SRE) at YouTube combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. As a Software Engineer III in SRE, you'll ensure that YouTube's services have reliability and uptime appropriate to customer needs while maintaining a fast rate of improvement. The role involves managing the complex challenges of scale unique to YouTube, drawing on expertise in coding, algorithms, complexity analysis, and large-scale system design.
Key responsibilities include:
The role requires a Bachelor's degree in Computer Science or a related field (or equivalent experience) and at least 2 years of experience with data structures/algorithms and software development. Preferred qualifications include experience in distributed systems, storage, or networking; expertise in designing and troubleshooting large-scale distributed systems; and strong problem-solving and communication skills.
Join YouTube's SRE team to work on meaningful projects in a diverse, intellectually curious environment that encourages collaboration, big thinking, and risk-taking. You'll have the opportunity to learn, grow, and contribute to keeping YouTube's infrastructure running smoothly for millions of users worldwide.