Robotics Reliability Engineer

A team of scientists, engineers, and machine learning experts advancing artificial intelligence for widespread public benefit and scientific discovery.
DevOps
Mid-Level Software Engineer
5,000+ Employees
AI · Robotics

Description For Robotics Reliability Engineer

At Google DeepMind, we're at the forefront of artificial intelligence research and development. We're seeking a Robotics Reliability Engineer to join our Robotics Research Engineering team. In this role, you'll work closely with software engineers and robotics AI researchers to enhance the reliability, performance, and observability of our robot software systems. Your primary focus will be improving the reliability of robot data collection and AI policy evaluation by identifying and resolving issues across the robotics stack, and supporting significant horizontal scaling.

Key responsibilities include:

  • Developing metrics to pinpoint systemic issues and prioritize their resolution
  • Collaborating with researchers, software engineers, and operators to debug and solve workflow-related issues
  • Designing reliable tools for monitoring and deploying robotics systems at scale
  • Leveraging tools to streamline diagnosis and management of our robot fleet and workstations

We're looking for candidates with:

  • A background in software engineering or site reliability engineering
  • Experience with distributed systems
  • Passion for robotics and improving complex robotic systems
  • Skills in developing metrics for system health monitoring
  • A collaborative mindset

Additional advantages include familiarity with ROS and open-source deployment tools; experience with Python, C++, Docker, shell scripting, and cloud infrastructure; and the ability to analyze complex systems.

At Google DeepMind, we value diversity and are committed to equal employment opportunity. We use our technologies for widespread public benefit and scientific discovery, collaborating on critical challenges while prioritizing safety and ethics. Join us in shaping the future of AI and robotics!


Responsibilities For Robotics Reliability Engineer

  • Develop metrics to identify systemic issues across hardware failures, cell reset/reconfiguration, debugging, deployment, and data quality
  • Work with researchers, software engineers, and operators to debug and solve workflow-related issues
  • Design reliable tools for monitoring and deploying robotics systems at scale
  • Leverage tools to streamline diagnosis and management of robot fleet and workstations

Requirements For Robotics Reliability Engineer

Python · Linux · Kubernetes
  • Background in software engineering or site reliability engineering
  • Experience building, deploying, running and debugging distributed systems
  • Passion for robotics and improving complex robotic systems
  • Experience developing and utilizing metrics for system health monitoring
  • Collaborative mindset

Benefits For Robotics Reliability Engineer

  • Equal employment opportunity
