Captions is the AI-powered creative studio empowering millions of creators worldwide to make their video content stand out. Based in NYC, we're a team of ambitious, experienced engineers, designers, and marketers building multimodal foundation models for features like text-to-video, avatar generation, video-to-video translation, talking face generation, and 3D reconstruction.
As a Research Engineer in Machine Learning, you'll:
- Train, implement, and deploy ML models driving product innovation
- Apply scientific principles to implement state-of-the-art algorithms for generative computer vision and video technologies
- Experiment with advanced neural network architectures
- Collaborate with cross-functional teams to integrate ML models into scalable systems
- Stay current with the latest research in machine learning and computer vision
Preferred Qualifications:
- Master's degree in computer science or a related field, plus 3+ years of industry experience
- Strong background in computer vision and generative models (diffusion, video generation, NeRFs, Gaussian splatting, GANs)
- Expertise in deep learning frameworks (TensorFlow, PyTorch)
- Strong understanding of CS fundamentals
We offer comprehensive benefits, team off-sites to exciting locations, and the opportunity to work on cutting-edge AI technology impacting millions of users. Join us in our mission to empower the next billion creators!