Spotify is seeking a Senior Data Engineer to join its Speak team, the in-house text-to-speech (TTS) team that supports products like DJ and AI Voice Translation, as well as new unreleased products. The role focuses on building world-class speech technologies that can power the next generation of personalized generative voice products at scale.
As a Senior Data Engineer, you'll be responsible for building large-scale speech and audio data pipelines using tools like Google Cloud Platform and Apache Beam. You'll work on machine learning projects powering new generative AI experiences and help build state-of-the-art text-to-speech models. The role involves learning and contributing to the team's understanding of best practices and techniques for building data pipelines for large-scale generative models, including cleaning, filtering, classifying, and labelling data.
You'll collaborate with other engineers, researchers, product managers, and stakeholders, taking on learning and leadership opportunities. The ideal candidate has strong data engineering experience, particularly with high-volume, heterogeneous data and distributed systems. Proficiency in Python and experience with data processing frameworks like Beam, Dataflow, or Spark are essential.
This position offers the opportunity to work on cutting-edge speech technology projects in a dynamic, collaborative environment. You'll be part of a team that values quality, agile processes, and responsible experimentation. If you're passionate about data engineering, machine learning, and building innovative voice products, this role at Spotify could be an excellent fit.