Senior Software Engineer, TPU Supercomputer

Google builds and maintains the technical infrastructure that powers its diverse product portfolio and services.
Distributed Systems
Senior Software Engineer
5,000+ Employees
5+ years of experience
AI

Description For Senior Software Engineer, TPU Supercomputer

Google is seeking a Senior Software Engineer to join its TPU Supercomputer team, developing and maintaining critical infrastructure for AI computing systems. This role sits at the intersection of distributed systems, hardware, and artificial intelligence, where you'll be responsible for designing and maintaining software for TPU supercomputing systems.

The position sits within Google's Technical Infrastructure team and requires expertise in system software development with C++ and in distributed systems. You'll manage the complete lifecycle of both computing and networking components for Google's AI supercomputers/hypercomputers, while creating robust debugging and observability tools.

This is an exceptional opportunity to work with cutting-edge technology in AI infrastructure, collaborating with various specialized teams including Silicon, Software, SRE, and Operations. You'll be directly involved in shaping the future of Google's AI computing capabilities, ensuring reliability and performance across the entire TPU stack.

The ideal candidate should have strong foundations in computer science with at least 5 years of C++ development experience and 3 years in distributed systems. Additional expertise in cloud platforms, machine learning frameworks, and networking technologies would be valuable. This role offers the chance to work on sophisticated technical challenges while contributing to Google's next-generation AI infrastructure.

Working at Google's Taipei office, you'll be part of a global team that takes pride in building and maintaining the architecture that powers Google's extensive product portfolio. The role combines deep technical expertise with collaborative problem-solving, making it perfect for engineers who enjoy working on complex systems at scale.

Last updated 11 days ago

Responsibilities For Senior Software Engineer, TPU Supercomputer

  • Design and maintain TPU supercomputer software across different layers of the software stack to control, monitor, build, deploy, qualify and serve the TPU supercomputing systems
  • Manage the whole lifecycle across both computing and networking components for Google's AI supercomputers/hypercomputers
  • Create system-level debuggability and observability tools in partnership with key stakeholders, ensuring alignment and functionality across the stack
  • Collaborate with Silicon, Software, Site Reliability Engineering (SRE), and Operations teams to drive reliability improvements and address issues across the entire TPU stack

Requirements For Senior Software Engineer, TPU Supercomputer

  • Bachelor's degree or equivalent practical experience
  • 5 years of experience in system software development with C++
  • 3 years of experience in distributed systems
  • Master's degree or PhD in Computer Science or related technical field (preferred)
  • Experience with production monitoring, logging, and observability tools (preferred)
  • Experience with cloud platforms and technologies (e.g., GCP) (preferred)
  • Experience with machine learning concepts and frameworks (e.g., TensorFlow) (preferred)
  • Experience with data analysis and SQL (preferred)
  • Knowledge of networking protocols and technologies (preferred)

Jobs Related To Google Senior Software Engineer, TPU Supercomputer

Senior Software Engineer, Infrastructure, Google Ads

Senior Software Engineer position at Google working on infrastructure for Google Ads, focusing on large-scale distributed systems development.

Senior Software Engineer, D-SDN, Google Global Networking

Senior Software Engineer position at Google focusing on D-SDN and global networking, developing distributed networking applications and supporting developer ecosystem.

Senior Software Engineer, Infrastructure, Core

Senior Software Engineer position at Google's Core team, focusing on infrastructure and distributed systems development with competitive compensation and benefits.

Senior Software Engineer, Project Starline

Senior Software Engineer position at Google working on Project Starline, developing revolutionary 3D communication technology that enables life-like virtual presence.

Senior Systems Research Engineer

Senior Systems Research Engineer role at Google focusing on next-generation technologies in cloud computing and distributed systems.