Microsoft's Search Ads Understanding team is seeking a Senior Software Engineer specializing in GPU Inference Optimization. This role focuses on optimizing GPU inference for large language models (LLMs) and small language models (SLMs) to support Ads tasks including query rewrite, Ad relevance, and Ad creative generation.
The position offers an opportunity to work with cutting-edge AI technology and hardware, developing fundamental abstractions, programming models, runtimes, libraries, and APIs. The team is responsible for building an intelligent system that matches advertisers' ad displays with users' queries using advanced AI models and sophisticated engineering systems.
The ideal candidate will have strong expertise in GPU optimization, C/C++ programming, and deep learning frameworks. You'll be working in a fast-paced environment, collaborating with researchers and developers to solve complex technical challenges in building a full end-to-end AI stack.
Microsoft offers a comprehensive benefits package, including industry-leading healthcare, educational resources, parental leave, and investment opportunities. The company maintains a strong commitment to diversity and inclusion, fostering a culture where everyone can thrive and contribute to its mission of empowering every person and every organization on the planet to achieve more.
The role is based in Beijing, China, with a hybrid work arrangement (up to 50% work from home). You'll be part of a team that drives user satisfaction, advertiser ROI, and Bing revenue through innovative solutions and technical excellence.