Microsoft's Azure Machine Learning team is seeking a talented Software Engineer to join its Inference team, which focuses on making ML accessible to enterprises, developers, and data scientists. The role involves developing next-generation model-serving capabilities, including hosting OpenAI models such as ChatGPT and scaling inference for Bing and Office applications. You'll work with cutting-edge AI technology, handle billions of daily requests, and tackle challenges at the intersection of AI and cloud computing.
The position offers the opportunity to work on high-impact projects: optimizing performance for high-throughput, low-latency scenarios and building reliable platforms for massive-scale model inference. You'll be part of a team that serves all internal and external ML workloads, working with state-of-the-art LLMs and diffusion models.
Within Microsoft's innovative environment, you'll collaborate with geo-distributed teams, work with the latest hardware stack, and contribute to one of the world's largest GPU fleets. The role requires technical expertise in Python, Go, and Kubernetes, along with strong collaborative skills. It's an excellent opportunity for anyone passionate about AI infrastructure who wants to make a significant impact on machine learning at scale.
The position includes comprehensive benefits such as industry-leading healthcare, educational resources, and parental leave. Microsoft is an equal opportunity employer and offers a supportive, inclusive work environment, making it an ideal place for professionals looking to advance their careers in AI and cloud computing.