Staff Software Engineer, Cloud ML Compute Services

Google Cloud provides organizations with leading infrastructure, platform capabilities, and industry solutions leveraging cutting-edge technology.
$189,000 - $284,000
Machine Learning
Staff Software Engineer
In-Person
5,000+ Employees
8+ years of experience
AI · Enterprise SaaS · Cloud

Description For Staff Software Engineer, Cloud ML Compute Services

Google Cloud is seeking a Staff Software Engineer to join its Cloud ML Compute Services team, which builds and supports the Google Cloud Platform (GCP) Cloud TPU and GPU services. The role contributes to next-generation technologies that shape how billions of users interact with information and with each other. The position involves working with cutting-edge AI infrastructure on the team responsible for the ML frameworks, tools, models, and processes that deliver scale and performance for ML workloads in Google Cloud.

The ideal candidate will have extensive experience in software development, machine learning, and technical leadership. You'll work on projects critical to Google Cloud's needs, with opportunities to switch teams and projects as the business evolves. The role requires expertise in high-performance computing, particularly with TPUs and GPUs, and deep knowledge of machine learning frameworks such as PyTorch and JAX.

As a Staff Software Engineer, you'll be at the forefront of AI infrastructure development, working with both high-level Python and low-level C++ implementations. You'll collaborate with various teams to improve LLM training and inference performance, develop new features, and work directly with Cloud TPU power users to solve complex technical challenges.

The position offers competitive compensation, including a base salary range of $189,000-$284,000, plus bonus, equity, and comprehensive benefits. This is an excellent opportunity for someone passionate about machine learning infrastructure who wants to make a significant impact on how organizations leverage Google's cutting-edge technology for their ML workloads.

Last updated 4 days ago

Responsibilities For Staff Software Engineer, Cloud ML Compute Services

  • Work across the tech stack to improve LLM training and inference performance on TPU
  • Add new features and publish high-performance open-source kernels
  • Partner with the XLA and PyTorch teams to design and implement new PyTorch features
  • Collaborate directly with Cloud TPU power users to solve tricky problems and enable new workloads
  • Enable smooth interoperability between JAX and PyTorch
  • Implement and benchmark reference PyTorch models and techniques

Requirements For Staff Software Engineer, Cloud ML Compute Services

Python
  • Bachelor's degree or equivalent practical experience
  • 8 years of experience in software development and with data structures/algorithms
  • 5 years of experience testing and launching software products
  • 3 years of experience with software design and architecture
  • 5 years of experience with machine learning algorithms, tools, and libraries
  • 3 years of experience in a technical leadership role leading project teams and setting technical direction

Benefits For Staff Software Engineer, Cloud ML Compute Services

  • Medical Insurance
  • Dental Insurance
  • Vision Insurance
  • Equity
  • 401k
