Principal Engineer, Cloud ML Compute Services

Google

Google Cloud provides enterprise-grade solutions leveraging cutting-edge technology and tools for digital transformation.

Kirkland, WA, USA • Sunnyvale, CA, USA

$278,000 - $399,000

Machine Learning

Principal Software Engineer

In-Person

5,000+ Employees

15+ years of experience

AI · Enterprise SaaS · Cloud

Description For Principal Engineer, Cloud ML Compute Services

Google's Cloud ML Compute Services (CMCS) is seeking a Principal Engineer to drive technical strategy for ML Frameworks and Models. This role focuses on building the best Cloud ML platform for demanding and innovative ML workloads. You'll be responsible for enabling massive-scale ML services powered by GPUs and TPUs, working with cutting-edge AI technologies including LLMs, mixture-of-experts (MoE) models, and diffusion models.

The position requires deep expertise in machine learning, distributed systems, and cloud architecture. You'll work on real-time scalability through model and data parallelization, performance tuning, and low-latency serving of both first-party and third-party models. The role involves collaboration with teams across Google, including Core ML, GDM, Storage, GKE, and Vertex AI.

As a Principal Engineer, you'll be at the forefront of AI/ML innovation, working with state-of-the-art hardware like Google TPUs and NVIDIA GPUs. You'll drive the technical strategy for large-scale training and inference services on GCP, while working with popular ML frameworks such as PyTorch, JAX, and TensorFlow.

This is an exceptional opportunity for a seasoned engineer to shape the future of cloud-based machine learning infrastructure at one of the world's leading technology companies. The role offers competitive compensation, including a substantial base salary range of $278,000-$399,000, plus bonus, equity, and comprehensive benefits.

The ideal candidate will combine deep technical expertise with strong leadership abilities, and will be capable of building strategic alignment across organizations and delivering innovative solutions that meet the dynamic needs of AI/ML compute. If you're passionate about machine learning and distributed systems and want to make a significant impact on the future of cloud computing, this role offers an opportunity to work with cutting-edge technology at scale.

Last updated a month ago

Responsibilities For Principal Engineer, Cloud ML Compute Services

Design, build, and deploy solutions that leverage GPUs, TPUs, and highly scalable hardware and software infrastructure
Build strategic alignment with major organizations across Google contributing to the ML landscape
Work across Engineering teams that build, design, and implement both hardware and software
Provide leadership for cloud developer technology inside Google
Optimize the latest emerging ML model types, benchmarks, and common ML frameworks

Requirements For Principal Engineer, Cloud ML Compute Services

Python

Bachelor's degree in Computer Science, Electrical Engineering, or equivalent practical experience
15 years of experience building software and distributed systems
10 years of experience with machine learning algorithms and tools
10 years of experience with hardware and software design, data structures and algorithms
10 years of experience with private and public cloud design
Experience with PyTorch, TensorFlow, JAX
Experience with LLMs, NLP, and deep learning models
Excellent organization, problem-solving, and prioritization skills
Outstanding teamwork and communication skills

Benefits For Principal Engineer, Cloud ML Compute Services

Bonus
Equity
Benefits
