Hugging Face, the fastest-growing AI platform, is seeking a Machine Learning Engineer to focus on Fast Optimized Inference. As part of our mission to democratize good AI, you'll join a platform serving 5M+ users and 100k+ organizations. The role involves creating specialized libraries for real-world ML use cases, building on our open-source foundation to develop industrial-grade solutions.
You'll work on developing specialized software similar to text-generation-inference, focusing on scalability and performance optimization. The position requires expertise in Python, Rust, and CUDA kernels, as well as frameworks such as Transformers, Keras, or PyTorch. You'll be responsible for enhancing software reliability, monitoring system health, and driving innovation in our production environment.
We offer a collaborative, inclusive environment with offices in NYC and Paris, though we're largely distributed. Our benefits include flexible working hours, comprehensive health coverage, parental leave, equity compensation, and professional development support. We're committed to building a diverse, equitable workplace where all team members can thrive.
Join us in advancing AI technology while working with industry-leading professionals. You'll be part of a community that values open collaboration and scientific advancement in the ML/AI field. If you're passionate about creating impactful AI solutions and want to contribute to a company that's actively shaping the future of machine learning, this role offers an excellent opportunity to make a difference.