As an engineer on the ML Data Platform team at Cruise, you will be responsible for building and supporting a petabyte-scale data platform in the cloud and providing powerful foundations for Cruise's ML Data Platform tools, frameworks, and services. Your responsibilities will include:
- Ensuring scalable, transparent, and reliable data ingestion and management
- Developing fast, robust, and spike-resistant data consumption, data mining, and processing tools for the entire company
- Developing orchestration for large-scale post-processing and computational pipelines
- Leading the development, optimization, and productionization of the next-generation data processing platform using Beam and Spark in the cloud
- Building self-serve capabilities to help customers adopt the next-generation data processing platform
- Owning, designing, implementing, and testing scalable distributed data systems using the latest cloud technologies
- Championing engineering excellence by continuously improving systems and processes
- Owning technical projects from start to finish, contributing to the team's product roadmap, and being responsible for major technical decisions and tradeoffs
- Effectively participating in team planning, code reviews, and design discussions
- Considering the effects of projects across multiple teams and proactively managing conflicts
- Working together with partner teams and organizations to achieve cross-organizational goals and satisfy broad requirements
- Conducting technical interviews and playing an essential role in recruiting activities
- Effectively onboarding and mentoring junior engineers and/or interns
The ideal candidate will have:
- Experience building a data processing system from the ground up using Beam / Spark and their ecosystems
- Experience optimizing data processing clusters for cost efficiency and performance
- Experience building serving systems capable of delivering data at high throughput, low latency, and high QPS in a cost-efficient and spike-resilient manner
- Experience building full ML model lifecycle solutions, from feature engineering to training, validation, deployment, and monitoring
- Experience building scalable infrastructure in the cloud with Python or Java/Scala (or similar)
- 10+ years of experience working with big data
- BS, MS, or Ph.D. in Computer Science, Electrical Engineering, Mathematics, Physics, or another relevant field; or equivalent real-world experience
- Passion for self-driving technology and its potential impact on the world
- Attention to detail and a passion for seeking truth
- A track record of efficiently solving complex problems
- Startup mentality: openness to dealing with unknown unknowns and wearing many hats
Bonus points for:
- Demonstrable expertise in building end-to-end data ingestion, processing, and serving systems at petabyte scale from the ground up
- Proficiency in writing SQL queries for analytic purposes
- Relevant publications
Cruise offers competitive salary and benefits, including medical/dental/vision insurance, subsidized mental health benefits, paid time off and holidays, parental leave, 401(k) matching, fertility benefits, and more. The company is committed to building a diverse, equitable, and inclusive environment, and encourages applications from all qualified candidates.