Microsoft's Azure Machine Learning team is seeking a talented Software Engineer to join its Inference team, which focuses on democratizing ML and making it accessible to enterprises, developers, and data scientists. The role involves working with cutting-edge AI technology, including the OpenAI models behind ChatGPT, and supporting Bing and Office applications. You'll be part of a team that handles billions of requests daily and develops next-generation model-serving solutions.
The position offers an exciting opportunity to work at the intersection of AI and cloud computing, where you'll be responsible for designing and building highly reliable platforms for model inferencing at massive scale. You'll tackle high-throughput, low-latency challenges and lead performance optimization initiatives.
Working in a collaborative, innovation-driven environment, you'll contribute to Microsoft's mission while handling state-of-the-art LLMs and diffusion models. The role requires expertise in Python, Go, and Kubernetes, with a focus on scalable solutions and performance optimization. You'll be part of a geo-distributed team, working with the latest hardware stack technologies, including CUDA and InfiniBand.
This is an ideal opportunity for someone passionate about machine learning infrastructure, with strong technical skills and a collaborative mindset. The position offers comprehensive benefits, including industry-leading healthcare, educational resources, and work-life balance support.