Meta is seeking a Software Engineer to join its Host Networking team, focusing on AI Transport solutions. The role is central to managing the millions of NICs in the fleet that powers all of Meta's services and applications, and in particular to the transport software for Meta's Training and Inference Accelerators.
The position offers an opportunity to work on transport solutions for large-scale AI clusters, developing innovative answers to complex challenges and implementing them in production environments. You'll be working with cutting-edge AI infrastructure technology, specifically network interface controllers and transport solutions for a distributed fleet of accelerators.
As part of the role, you'll be deeply involved in designing and implementing drivers for Ethernet network adapter functions, working with RDMA transport stacks, and managing control functions between hosts and accelerators. The position requires strong expertise in C, C++, and Python programming, a deep understanding of Linux kernel operations, and experience with transport stack technologies.
The ideal candidate will have a background in Computer Science or a related field, with hands-on experience in debugging large-scale systems. Knowledge of QEMU and FPGA emulation environments would be advantageous. You'll be joining Meta, a company at the forefront of social technology innovation, working on projects that go beyond traditional digital connections into immersive technologies like AR and VR.
This is an excellent opportunity for someone passionate about AI infrastructure and networking, offering the chance to work on systems that power Meta's global services while pushing the boundaries of what's possible in AI transport solutions.