Technical Program Manager, Machine Learning Operations and Maintenance

Google is a global technology leader that specializes in internet-related services and products, including search, cloud computing, software, and hardware.
Sunnyvale, CA, USA · New Albany, OH, USA · Reston, VA, USA
$168,000 - $252,000
Cloud
Staff Software Engineer
Hybrid
5,000+ Employees
8+ years of experience
AI · Cloud

Description For Technical Program Manager, Machine Learning Operations and Maintenance

Google is seeking a Technical Program Manager for Machine Learning Operations and Maintenance to join its Central Operations team. This role involves managing complex, multi-disciplinary projects related to data center operations, with a focus on Machine Learning workload dependencies, maintenance policies, and global strategies for shutdown/turnaround maintenance.

Key responsibilities include:

  1. Documenting ML workload dependencies on power and cooling infrastructure
  2. Developing and implementing Maintenance SLO policies for Data Center Operations
  3. Creating a global strategy for shutdown/turnaround maintenance
  4. Implementing a planned downtime communications solution for internal and external Cloud customers
  5. Collaborating with partner teams to implement programmatic changes in various processes

The ideal candidate will have:

  • A Bachelor's degree in a relevant field or equivalent practical experience
  • 8+ years of experience in critical operations, global change management, or technical program management
  • Experience managing multiple vendors and external partners in a 24x7 environment
  • Knowledge of electrical/power and mechanical/cooling engineering
  • Experience with global change governance and maintenance in data centers
  • Strong problem-solving and data analytics skills
  • Ability to travel 40-50% of the time as needed

Google offers a competitive salary range of $168,000-$252,000 plus bonus, equity, and benefits. They are committed to diversity, equity, and inclusion, aiming to build a workforce that represents the users they serve. This role provides an opportunity to work on cutting-edge technology and contribute to Google Cloud's mission of accelerating digital transformation for organizations worldwide.

Last updated 24 days ago

Responsibilities For Technical Program Manager, Machine Learning Operations and Maintenance

  • Learn, document and align Machine Learning (ML) workload dependencies on power and cooling infrastructure and various triggers for load shedding
  • Develop, communicate and implement a Maintenance SLO policy for Data Center Operations (DCOps)
  • Develop and implement a global strategy and a playbook for shutdown/turnaround maintenance for facilities operations
  • Design, develop and implement a sustaining planned downtime communications solution supporting both internal and external Cloud customers
  • Work with partner teams to implement programmatic changes to supply chain, resource planning, contracting, security, environmental and other processes impacted by a parallel maintenance approach

Requirements For Technical Program Manager, Machine Learning Operations and Maintenance

  • Bachelor's degree in a relevant field, or equivalent practical experience
  • 8 years of experience in critical operations, global change management, supply chain, risk management, or technical program management
  • Experience in managing and coordinating work with multiple vendors and external partners in a 24x7 or event-based, time-constrained environment
  • Experience in electrical/power and mechanical/cooling engineering and equipment
  • Experience with global change governance and maintenance in data centers, manufacturing or 24x7 operations

Benefits For Technical Program Manager, Machine Learning Operations and Maintenance

  • Bonus
  • Equity
  • Benefits

Jobs Related To Google Technical Program Manager, Machine Learning Operations and Maintenance

Solutions Architect, Indonesia SA Team

Senior Solutions Architect position at AWS helping digital native companies in Indonesia accelerate growth through cloud technology implementation and architecture.

Senior Water Strategy Manager, AWS Water Team

Senior Water Strategy Manager role at AWS focusing on water infrastructure and sustainability for data centers, requiring 10+ years of experience in utility and project management.

Technical Program Manager, Independent Software Vendor Lead

Senior Technical Program Manager role leading ISV technical team at Microsoft, focusing on marketplace solutions and cloud architecture.

Software Engineering Manager - Virtualization

Lead Apple's Virtualization team in developing cutting-edge virtual platform technologies while managing system software engineers and collaborating across hardware and software teams.

Data Center Deployment Engineer L4/L5

Lead data center deployment engineer role at Netflix, managing global infrastructure for streaming services, requiring 5+ years experience in edge networks and data centers.