Boson AI, an innovative startup in the AI space, is seeking a Senior High Performance Computing Engineer to join its team in Toronto. Founded by renowned experts Alex Smola and Mu Li, the company is at the forefront of developing generative AI models for language, audio, and entertainment.
The role offers an exceptional opportunity to work with cutting-edge technology, including NVIDIA H100 and A100 GPUs, over 20PB of storage, terabit networking, and hundreds of computers. You'll be responsible for operating the GPUs, network, and filesystem in the datacenter deployment, which calls for strong problem-solving skills and an adaptable learning mindset.
As a Senior HPC Engineer, you'll be deeply involved in managing high-end GPU clusters, handling system deployments, and maintaining critical infrastructure components. The position demands expertise in various technologies, including Slurm, MAAS, Ceph, InfiniBand, and NVIDIA DeepOps, along with strong networking knowledge.
The ideal candidate will bring substantial experience in high-performance computing, data center operations, and large hardware cluster management. Your role will be crucial in designing, deploying, and maintaining production-grade machine learning systems at scale, making this an excellent opportunity for someone passionate about infrastructure and AI technology.
The position is hybrid and offers a competitive salary range of $150,000 - $250,000. You'll be part of a team pushing the boundaries of AI technology, with the chance to work with state-of-the-art hardware and contribute to the development of next-generation AI tools.