Software Engineer, TPU, Machine Learning Supercomputer

Google develops next-generation technologies that change how billions of users connect, explore, and interact with information.
Machine Learning
Entry-Level Software Engineer
In-Person
5,000+ Employees
1+ year of experience
AI

Description For Software Engineer, TPU, Machine Learning Supercomputer

Google is seeking a Software Engineer to join its Technical Infrastructure team, focusing on TPU (Tensor Processing Unit) and Machine Learning Supercomputer systems. This role combines software engineering with AI infrastructure development. You'll design and maintain TPU supercomputer software, manage AI computing systems, and create debugging tools. The position requires software development experience, an understanding of distributed systems, and knowledge of machine learning concepts.

The role offers an opportunity to work on critical projects that power Google's large-scale operations. You'll collaborate with the Silicon, Software, Site Reliability, and Operations teams to ensure the reliability and performance of TPU systems. The position involves development across different layers of the software stack, from system-level tools to high-level applications.

As part of Google's Technical Infrastructure team, you'll be at the forefront of maintaining and developing the architecture that keeps Google's product portfolio running. The team takes pride in being "engineers' engineers" and focuses on building and maintaining the next generation of Google platforms. This role is perfect for someone who wants to combine software engineering expertise with machine learning infrastructure development at one of the world's leading technology companies.

The ideal candidate has experience in software development, an understanding of distributed systems, and knowledge of machine learning algorithms. You'll work in an environment that values innovation, technical excellence, and collaboration. Google offers a supportive workplace culture that emphasizes diversity, inclusion, and equal opportunity for all employees.

Responsibilities For Software Engineer, TPU, Machine Learning Supercomputer

  • Design and maintain Tensor Processing Unit (TPU) supercomputer software across different layers of the software stack
  • Manage the whole lifecycle across both computing and networking components for Google's Artificial Intelligence (AI) supercomputers/hypercomputers
  • Create system-level debuggability and observability tools in partnership with key stakeholders
  • Collaborate with Silicon, Software, Site Reliability, and Operations teams to drive reliability improvements

Requirements For Software Engineer, TPU, Machine Learning Supercomputer

  • Bachelor's degree or equivalent practical experience
  • 1 year of experience with software development in one or more programming languages (e.g., Python, C, C++, Java, JavaScript)
  • Understanding of distributed systems concepts
  • Knowledge of common Machine Learning (ML) algorithms and how they map to software and hardware operations
  • Knowledge of networking, especially at the link layer, along with routing algorithms and topologies
  • Ability to build backend software for high performance computing and ML applications

Benefits For Software Engineer, TPU, Machine Learning Supercomputer

  • Medical insurance
  • Vision insurance
  • Dental insurance
  • Parental leave
  • Equal opportunity employer
  • Accommodations for disabilities
  • Inclusive workplace culture
