Hugging Face, the fastest-growing AI platform with over 5 million users and 100k organizations, is seeking a Machine Learning Engineer focused on Fast Optimized Inference. This role is suited to engineers interested in creating specialized ML libraries for real-world applications. You'll work on developing software similar to text-generation-inference, focusing on industrial-level usage and scalability. The position involves writing specialized code that builds on the company's open-source foundation, whose libraries have 400k+ GitHub stars.
The role combines hands-on development with performance optimization and production management. You'll be responsible for developing ML-specific software, ensuring system reliability, and monitoring production environments. The ideal candidate should be proficient in Python, Rust, and specialized CUDA kernels, as well as ML frameworks such as transformers, Keras, or PyTorch.
Hugging Face offers an inclusive, development-focused environment where you'll work with industry-leading professionals. They provide comprehensive benefits including flexible remote work, health/dental/vision coverage, parental leave, and equity participation. The company strongly values diversity and community contribution, supporting the broader ML/AI ecosystem through collaborative scientific advancement.
This position offers an opportunity to contribute to AI democratization while working with cutting-edge technologies. You'll be part of a progressive, decentralized team developing solutions that enhance user experiences and push the boundaries of AI applications. The role combines technical expertise with real-world impact, making it a good fit for engineers who want to advance AI technology while keeping their work grounded in practical applications.