Staff Software Engineer, Cloud ML Compute Services

Google Cloud provides organizations with leading infrastructure, platform capabilities, and industry solutions leveraging cutting-edge technology.
$189,000 - $284,000
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS · Cloud

Description For Staff Software Engineer, Cloud ML Compute Services

Google Cloud is seeking a Staff Software Engineer to join its Cloud ML Compute Services team, focusing on building and supporting Google Cloud Platform's Cloud TPU and GPU services. This role is critical in developing next-generation technologies that impact billions of users' interactions with information and each other. The position requires expertise in machine learning infrastructure, working with frameworks like PyTorch and JAX, and optimizing performance for ML workloads.

The role involves working across the full technology stack, from high-level Python to low-level C++, to improve LLM training and inference performance on TPU. You'll be responsible for implementing new features, publishing high-performance kernels, and collaborating with various teams to enhance PyTorch capabilities and enable new workloads.

As a Staff Software Engineer, you'll be part of Google Cloud's mission to accelerate digital transformation across industries. The team provides enterprise-grade solutions leveraging Google's cutting-edge technology, serving customers in more than 200 countries. You'll work with state-of-the-art ML supercomputers and have the opportunity to impact how organizations access and utilize machine learning capabilities at scale.

The position offers competitive compensation, including a base salary range of $189,000-$284,000, plus bonus, equity, and comprehensive benefits. You'll work with industry-leading professionals and have the chance to shape the future of machine learning infrastructure while solving complex technical challenges that affect users worldwide.

This role is perfect for someone who is passionate about machine learning, has strong technical leadership experience, and wants to work at the forefront of AI infrastructure development. You'll have the opportunity to influence technical direction, mentor team members, and contribute to Google Cloud's mission of enabling organizations to leverage cutting-edge ML capabilities efficiently and effectively.

Responsibilities For Staff Software Engineer, Cloud ML Compute Services

  • Work across the tech stack to improve LLM training and inference performance on TPU
  • Add new features and publish high-performance open-source kernels
  • Partner with the XLA and PyTorch teams to design and implement new PyTorch features
  • Collaborate directly with Cloud TPU power users to solve tricky problems and enable new workloads
  • Enable smooth interoperability between JAX and PyTorch
  • Implement and benchmark reference PyTorch models and techniques

Requirements For Staff Software Engineer, Cloud ML Compute Services

Python
  • Bachelor's degree or equivalent practical experience
  • 8 years of experience in software development and with data structures/algorithms
  • 5 years of experience testing and launching software products
  • 3 years of experience with software design and architecture
  • 5 years of experience with machine learning algorithms, tools, and libraries
  • Experience with building high-quality and reusable AI infrastructure, compilers, or performance engineering
  • Experience with stack-spanning systems and tools, from high-level Python to low-level C++
  • Understanding of the full user experience

Benefits For Staff Software Engineer, Cloud ML Compute Services

  • Medical insurance
  • Dental insurance
  • Vision insurance
  • Bonus
  • Equity
  • Benefits
