LinkedIn, the world's largest professional network, is seeking a Principal Staff Software Engineer to join its AI Platform group. This role is part of the AI Training team, which is responsible for developing and maintaining highly available and scalable deep learning training solutions that power LinkedIn's rapidly growing AI use cases.
The position offers a hybrid work arrangement, allowing flexibility between remote work and office presence in Mountain View, CA, San Francisco, CA, or Bellevue, WA. The team handles training infrastructure for models with hundreds of billions of parameters, spanning recommendation systems, large language models (Generative AI), and computer vision models.
As a Principal Staff Software Engineer, you'll be at the forefront of scaling LinkedIn's AI model training capabilities, working with cutting-edge technologies and frameworks. The role involves optimizing training performance across algorithms, AI frameworks, infrastructure software, and hardware to maximize the potential of LinkedIn's GPU fleet, which comprises thousands of latest-generation cards.
The position offers an exciting opportunity to work with open-source technologies, as the team includes many open-source committers (TensorFlow, Horovod, Ray, Hadoop, etc.). You'll be involved with advanced technologies like LLMs, GNNs, incremental learning, and online learning, while also working on training infrastructure.
Key responsibilities include leading technical strategy development, implementing large-scale distributed training systems, improving system observability, mentoring other engineers, and collaborating with the open-source community. The role requires extensive experience in software development, deep learning systems, and technical leadership.
The compensation package is competitive, ranging from $207,000 to $340,000, complemented by comprehensive benefits including medical, dental, and vision insurance, 401(k), parental leave, and various other perks. This is an excellent opportunity for a seasoned engineer to make a significant impact on AI infrastructure at scale while working with a talented team at one of the world's leading professional networks.