Principal Engineer, Cloud ML Compute Services

Google Cloud provides enterprise-grade solutions leveraging cutting-edge technology and tools for digital transformation.
$278,000 - $399,000
Machine Learning
Principal Software Engineer
In-Person
5,000+ Employees
15+ years of experience
AI · Enterprise SaaS · Cloud

Description For Principal Engineer, Cloud ML Compute Services

Google's Cloud ML Compute Services (CMCS) is seeking a Principal Engineer to drive technical strategy for ML Frameworks and Models. This role focuses on building the best Cloud ML platform for demanding and innovative ML workloads. You'll be responsible for enabling massive-scale ML services powered by GPUs and TPUs, working with cutting-edge AI technologies including LLMs, mixture-of-experts (MoE) models, and diffusion models.

The position requires deep expertise in machine learning, distributed systems, and cloud architecture. You'll work on real-time scalability through model and data parallelization, performance tuning, and low-latency serving of both first-party and third-party models. The role involves collaboration with teams across Google, including Core ML, Google DeepMind (GDM), Storage, GKE, and Vertex AI.

As a Principal Engineer, you'll be at the forefront of AI/ML innovation, working with state-of-the-art hardware like Google TPUs and NVIDIA GPUs. You'll drive the technical strategy for large-scale training and inference services on GCP, while working with popular ML frameworks such as PyTorch, JAX, and TensorFlow.

This is an exceptional opportunity for a seasoned engineer to shape the future of cloud-based machine learning infrastructure at one of the world's leading technology companies. The role offers competitive compensation, including a substantial base salary range of $278,000-$399,000, plus bonus, equity, and comprehensive benefits.

The ideal candidate will combine deep technical expertise with strong leadership abilities, and will be capable of building strategic alignment across organizations and delivering innovative solutions that meet the dynamic needs of AI/ML compute. If you're passionate about machine learning and distributed systems and want to make a significant impact on the future of cloud computing, this role offers an unparalleled opportunity to work with cutting-edge technology at scale.

Last updated a month ago

Responsibilities For Principal Engineer, Cloud ML Compute Services

  • Design, build, and deploy solutions that leverage GPUs, TPUs, and highly scalable hardware and software infrastructure
  • Build strategic alignment with major organizations across Google contributing to the ML landscape
  • Work across Engineering teams that build, design, and implement both hardware and software
  • Provide leadership for cloud developer technology inside Google
  • Optimize the latest emerging ML model types, benchmarks, and common ML frameworks

Requirements For Principal Engineer, Cloud ML Compute Services

  • Python
  • Bachelor's degree in Computer Science, Electrical Engineering, or equivalent practical experience
  • 15 years of experience building software and distributed systems
  • 10 years of experience with machine learning algorithms and tools
  • 10 years of experience with hardware and software design, data structures and algorithms
  • 10 years of experience with private and public cloud design
  • Experience with PyTorch, TensorFlow, JAX
  • Experience with LLMs, NLP, and deep learning models
  • Excellent organization, problem-solving, and prioritization skills
  • Outstanding teamwork and communication skills

Benefits For Principal Engineer, Cloud ML Compute Services

  • Bonus
  • Equity
  • Comprehensive benefits
