LinkedIn is seeking a Staff Software Engineer to join their AI Platform team, focusing on scaling large model training and serving infrastructure. The role involves pushing the boundaries of AI scaling, working with models containing hundreds of billions of parameters across recommendation systems, large language models, and computer vision applications. The team optimizes performance across algorithms, frameworks, data infrastructure, and hardware to maximize their GPU fleet's capabilities.
The position spans several key areas, including Model Training Infrastructure, Feature Engineering, Model Serving Infrastructure, and MLOps. You'll work with cutting-edge technologies and frameworks such as TensorFlow, Horovod, Ray, vLLM, Hugging Face, and DeepSpeed. The team actively contributes to the open-source community and includes many open-source committers.
As a Staff Engineer, you'll lead technical strategy development, implement large-scale distributed systems, improve system observability, and mentor other engineers. You'll work on challenging problems like scaling models with hundreds of billions of parameters, optimizing training and serving infrastructure, and building robust feature engineering platforms handling millions of QPS.
The role offers the opportunity to shape the future of AI infrastructure at one of the world's largest professional networks. You'll collaborate with talented researchers and engineers while building your career and personal brand in the AI industry. LinkedIn offers a hybrid work environment, combining remote work with in-office collaboration on select days.
The position requires strong technical leadership, deep expertise in distributed systems and AI infrastructure, and the ability to mentor others while driving technical excellence. You'll solve complex AI scaling challenges alongside a world-class team in a collaborative environment.