ML Research Engineer Internship, SmolLMs pretraining and datasets - EMEA Remote

Hugging Face is an AI platform company democratizing good AI, with over 5 million users and 100k organizations sharing 1M+ models, 300k datasets & apps.
France
Machine Learning
Software Engineering Intern
Remote
AI

Description For ML Research Engineer Internship, SmolLMs pretraining and datasets - EMEA Remote

Hugging Face, a leading AI development platform with over 5 million users and 100k organizations, is seeking an ML Research Engineer Intern for its SmolLM team. The role focuses on advancing small language models, which enable cheaper inference and on-device execution and support customization and privacy.

The role involves working with state-of-the-art infrastructure, including a scalable CPU cluster and an H100 cluster with nearly 100 nodes. You'll be part of the SmolLM team, contributing to building high-quality pre-training and post-training datasets, and implementing cutting-edge architecture and training techniques to develop state-of-the-art models.

The ideal candidate should be passionate about training LLMs and building high-quality datasets, with strong Python skills. You'll have the opportunity to work on developing the best small models in the field, collaborating with a team that's pushing the boundaries of AI technology.

Hugging Face offers a supportive and inclusive work environment, emphasizing diversity and professional growth. The company provides flexible working arrangements, comprehensive development opportunities, and strong community engagement in the ML/AI field. Its open-source libraries have garnered over 400k stars on GitHub, demonstrating their significant impact in the AI community.

This internship offers a unique opportunity to contribute to groundbreaking research in small language models while working with cutting-edge technology and a talented team. Whether you're in the office or working remotely, you'll be supported with the resources and mentorship needed to succeed in this role.

Last updated a month ago

Responsibilities For ML Research Engineer Internship, SmolLMs pretraining and datasets - EMEA Remote

  • Work with the SmolLM team on building the next generation of small language models
  • Iterate on datasets and models
  • Work with distributed training infrastructure
  • Build high-quality pre-training and post-training datasets

Requirements For ML Research Engineer Internship, SmolLMs pretraining and datasets - EMEA Remote

  • Proficiency in Python
  • Passion for training LLMs and building high-quality datasets
  • Cover letter explaining interest in open-source work at Hugging Face

Benefits For ML Research Engineer Internship, SmolLMs pretraining and datasets - EMEA Remote

  • Flexible working hours
  • Remote work options
  • Opportunity to visit offices
  • Workstation support
  • Conference and training reimbursement
  • Educational development support
