Google is seeking a Principal Site Reliability Engineer to lead ML Acceleration initiatives, focusing on the delivery and implementation of ML resources across its global infrastructure. The role combines deep technical expertise in distributed systems, capacity planning, and ML infrastructure with strategic leadership. You'll be responsible for turning chips from global fabs into ML supercomputers within gigawatt-scale data centers, coordinating across the Data Center Construction, Networking, and Machine Delivery teams to optimize ML capacity delivery. As part of Google's Technical Infrastructure team, you'll help maintain and develop the next-generation platforms that power Google's extensive product portfolio. The role offers competitive compensation, including a robust benefits package, and the opportunity to work on large-scale projects that shape the future of Google's ML infrastructure.