Rivos is seeking a DL Communications Collectives SW Engineer to join its team working to improve the deep learning ecosystem. This role involves designing and implementing highly optimized collective communication libraries similar to UCC and NCCL. The ideal candidate will work closely with hardware and software teams to ensure efficient data communication and synchronization across multiple AI accelerators in a distributed system.
Key responsibilities include building the communication components of an AI software stack, porting AI software to new hardware platforms, and optimizing communication within AI applications. The engineer will design and implement various communication collectives, optimize algorithms for multi-node clusters, and ensure low-latency, high-bandwidth communication across multi-GPU setups.
The ideal candidate should have a strong background in GPU architectures and in parallel and distributed algorithms, along with experience with network interconnects. Proficiency in collective communication libraries, deep learning frameworks, and low-level performance optimization on GPU architectures is crucial. The role requires excellent problem-solving skills, strong communication abilities, and the capacity to work effectively in a fast-paced, collaborative environment.
Rivos offers the opportunity to work with industry veterans, learning technical and organizational skills while contributing to open-source projects. This position is perfect for someone passionate about advancing AI technology and eager to tackle complex challenges in distributed computing and machine learning.