NVIDIA, a pioneer in computer graphics and accelerated computing for over 25 years, is seeking a Python Software Engineer to advance its GPU-accelerated data engineering initiatives for Large Language Model (LLM) tools and libraries. The role centers on accelerating the pre-processing pipelines used to curate high-quality multi-modal datasets.
The position involves developing efficient, scalable systems for de-duplicating, filtering, and classifying training corpora for foundation LLMs, as well as handling datasets for Retrieval-Augmented Generation (RAG) pipelines. The role requires expertise in Python library development, a deep understanding of ML/DL ecosystems, and experience with distributed programming frameworks.
As part of NVIDIA's team, you'll work with supercomputers that have thousands of GPUs. The ideal candidate is passionate about releasing early and often, receptive to user feedback, and comfortable evaluating AI models and frameworks for acceleration potential.
The role offers a compensation package ranging from $148,000 to $276,000, plus equity and comprehensive benefits. NVIDIA provides a diverse, supportive environment where innovation thrives and employees can make a lasting impact on the world. The company is committed to fostering diversity and equality in the workplace.
This is an exceptional opportunity for a skilled Python developer with machine learning expertise to join a world leader in AI and accelerated computing, contributing to groundbreaking developments in LLM technology while working with some of the industry's brightest minds.