The Azure AI Infrastructure team is looking for passionate engineers to build the largest deep-learning infrastructure service at Microsoft. In this role, you will build new components that bring the latest innovations in AI infrastructure onto the Azure AI Platform. You will partner with top engineering talent within Azure AI Infrastructure and across Azure to work on cluster orchestration, job scheduling, storage, networking, containerization, and operating system integration.
Your work will enable various AI languages and runtimes on Azure AI Infrastructure to bring distributed deep learning training and inferencing to life. You will develop the infrastructure components required to build, deploy, monitor, and service highly available and scalable Microsoft Service Fabric and Kubernetes clusters. You will lead development and customer support from the frontline and establish the architecture, service excellence guidelines, and a high quality bar.
We are engineers on Azure AI Infrastructure. We believe that building a planet-scale AI supercomputer from the ground up, one that addresses the fundamental pain points of data scientists and AI practitioners and takes AI to unprecedented scale, is the opportunity of a lifetime.
Azure AI Infrastructure is a globally distributed, multi-tenant service that provides robust, cost-effective, and competitive AI infrastructure (compute, networking, and storage) for AI training and inferencing. By abstracting workloads from the underlying infrastructure, Azure AI Infrastructure creates a shared pool of resources that can be dynamically provisioned to make full use of expensive GPU compute. This lets data scientists productively build, scale, experiment with, and iterate on their models on top of a performant, scalable distributed infrastructure built for AI.
Responsibilities include:
Join us in building the future of AI infrastructure at Microsoft!