Microsoft is seeking a Principal Software Engineer to join its GPU optimization team, which focuses on cloud-scale engineering and LLM optimization. The role involves working with Microsoft Copilot to optimize GPU cloud infrastructure usage, balancing low-latency inference with large-scale LLM operations. The team operates across Europe with flexible work arrangements, including remote options.
The ideal candidate will have extensive experience in GPU optimization, distributed systems, and high-performance computing. They will work directly with data scientists and engineers to develop, train, and run LLMs while optimizing for latency, throughput, and total cost of ownership. The position requires strong technical skills in languages like Python, Java, and JavaScript, along with expertise in GPU technologies, particularly NVIDIA's.
This is an opportunity to work at the forefront of AI infrastructure optimization, contributing to Microsoft's mission of empowering every person and organization globally. The role offers a collaborative environment with a focus on innovation, growth mindset, and inclusive culture. Benefits include comprehensive healthcare, educational resources, parental leave, and various other perks.
The position requires at least 6 years of technical engineering experience (preferably 10+) and deep knowledge of GPU optimization, distributed systems, and performance scaling. The successful candidate will drive efficiency practices, lead technical initiatives, and collaborate across organizational boundaries to maximize hardware utilization for next-generation AI models.