Google's Cloud ML Compute Services (CMCS) is seeking a Principal Engineer to drive technical strategy for ML Frameworks and Models. This role focuses on building the best Cloud ML platform for demanding and innovative ML workloads. You'll be responsible for enabling massive-scale ML services powered by GPUs and TPUs, working with cutting-edge AI technologies including LLMs, MoE, and diffusion models.
The position requires deep expertise in machine learning, distributed systems, and cloud architecture. You'll work on real-time scalability through model and data parallelization, performance tuning, and low-latency serving of both first-party and third-party models. The role involves collaboration with teams across Google, including Core ML, GDM, Storage, GKE, and Vertex AI.
As a Principal Engineer, you'll be at the forefront of AI/ML innovation, working with state-of-the-art hardware like Google TPUs and NVIDIA GPUs. You'll drive the technical strategy for large-scale training and inference services on GCP, while working with popular ML frameworks such as PyTorch, JAX, and TensorFlow.
This is an exceptional opportunity for a seasoned engineer to shape the future of cloud-based machine learning infrastructure at one of the world's leading technology companies. The role offers competitive compensation, including a substantial base salary range of $278,000-$399,000, plus bonus, equity, and comprehensive benefits.
The ideal candidate will combine deep technical expertise with strong leadership abilities, and will be capable of building strategic alignment across organizations and delivering innovative solutions that meet the dynamic needs of AI/ML compute. If you're passionate about machine learning and distributed systems and want to make a significant impact on the future of cloud computing, this role offers an unparalleled opportunity to work with cutting-edge technology at scale.