NVIDIA is seeking a Technical Leader to manage its GPU Communications Libraries and Networking team, focusing specifically on the NVSHMEM and UCX libraries. This role is crucial to delivering communication libraries for Deep Learning and HPC applications that run at massive GPU scale. The position combines technical leadership with team management, requiring expertise in HPC networking and system software.
The role involves leading a team that develops critical communication libraries that directly impact application performance across thousands of GPUs. You'll be working with cutting-edge technologies including NVLink, PCIe, and high-speed networking solutions like InfiniBand and Ethernet. This is an opportunity to push technological boundaries and contribute to NVIDIA's vision of advancing accelerated computing.
As a Software Engineering Manager, you'll be responsible for both technical leadership and team management. The role requires deep technical expertise in HPC networking, system software, and communication runtimes, combined with strong leadership skills to mentor and grow your team. You'll collaborate with internal and external partners, researchers, and various engineering teams to shape product roadmaps and drive innovation.
The ideal candidate brings 10+ years of industry experience, with particular expertise in HPC networking or system software, and 4+ years of management experience. Strong programming skills in C/C++ and Linux environments are essential, as is a deep understanding of computer system architecture and operating system principles. Experience with parallel programming models, RDMA, and high-performance networking technologies would be particularly valuable.
Working at NVIDIA means joining a company at the forefront of AI and accelerated computing, with a supportive environment that encourages innovation and impact. The role offers competitive compensation, including equity, and the opportunity to work on technologies that are transforming multiple industries.