
Looking to transition to ML Engineer? What do you want to know?

Ilya Reznik (Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe) 2 months ago

This post is part AMA and part request for your insights.

I’m currently developing a Taro course for Q1 of 2025, focused on navigating the transition to a career in machine learning. If this topic resonates with you, I’d love to hear your thoughts:

1. What are the most pressing questions or concerns you have about making this transition?

2. What is your current role or background (e.g., Software Engineer, Data Engineer, SRE/PE, Data Scientist, etc.)?

About Me:

I’ve transitioned from Software Engineer to Data Scientist to Machine Learning Engineer over the course of my 14-year career. I’ve been an MLE for 10 years, with experience at Adobe, Twitter, Meta, and as Head of MLOps at a Series B startup.

Thank you!

Update: Thank you for all the questions; keep them coming. I put together a YouTube video with a quick outline of the process as I generally recommend it. It is a high-level overview, and more practical material is coming in a Taro course early next year.


Discussion

(27 comments)
  • 15
    Tech Lead @ Robinhood, Meta, Course Hero
    2 months ago

    Just echoing some stuff I've heard across the Taro community and engineers overall:

    1. Do you need a PhD or Master's to break into ML, particularly if you're a junior engineer?
    2. How does the transition vary if you're a junior engineer (blank slate) vs. a senior engineer?
    3. Is it possible to transition via the interview? Let's say you are a high-performing back-end engineer at Google. Would you be able to convince Meta to hire you as an MLE?
    4. Outside of getting a degree, what can you do to gain MLE skills and build them to a point where your ML accomplishments are impressive to recruiters/hiring managers?
    5. Are there certain types of engineers from a domain perspective (data engineer, back-end engineer, Big Data engineer, front-end engineer) where it's easier/harder to make the switch into MLE?
    • 4
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      Excellent questions!

      1. The degree does help. It doesn't have to be "in ML" or even computer science, but most of us have graduate degrees and if you don't it will be an uphill battle. I have seen people succeed without a degree, so it is not a "prerequisite" but it helps a lot.
      2. As you would expect, juniors have less to leave behind, which usually makes the ROI on a transition much more realistic, whereas senior engineers in adjacent fields (SWE, DS...) bring more to the table. As a hiring manager, I am very apprehensive about people transitioning at staff+. Realistically, you are not worth the salary at that level for a long time. You could say "why not take a lower level?" but most employers prefer not to hire an L5 with 30 years of experience. I have seen it done, but it is harder past a certain level. The worst ML hire I have ever seen was a former Google Distinguished Engineer. He is an impressive software engineer but didn't do the groundwork in ML, which cost a company I know years on its ML roadmap.
      3. In FAANG, probably not. Experience matters, unless you are L3. A high-performing FAANG engineer may be able to convince a startup to interview them for an ML role.
      4. Alex, you actually know the answer to this one! The only accomplishments I care about as a hiring manager are ones that are validated by other people: getting a paper accepted at a conference, building a project that 10,000 people use, making an open source contribution that a maintainer accepts, doing a freelancing project... I don't care about dead code, meaning something only you have ever built and run. I need to see validation.
      5. MLE is an agglomeration of a lot of various skills from PM (yes, we often have to do PM work because we understand nuances of ML that are hard to communicate to non-engineering functions) to data, coding, etc. As such, every transition has some things that are easy and some things that are hard.
  • 6
    Friendly Tarodactyl
    Taro Community
    2 months ago

    What’s the difference in technical interviews between ML/SWE?

    Is the interview process standard and uniform like in big tech (DSA/Leetcode-style questions, systems), or is it on a team-by-team basis?

    • 3
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      Excellent question, will for sure address in the course.

      Short and to the point answer:

      • For big companies the process is very standardized. DSA, ML System Design, and Behavioral are the main rounds. For Sr Staff+ you will likely be asked for a project deep dive (you should know the tradeoffs that were made and present them in a STAR-like format, but with a lot of technical detail).
      • For smaller companies there is an ML Fundamentals round where you get asked about ML concepts. The design round focuses very narrowly on what the company does.
      • I am also seeing more and more ML coding at non-FAANG companies. You can be asked to code a model from scratch (usually sklearn level, not PyTorch) or to look at PyTorch code and find bugs.
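
      To give a feel for what "code a model from scratch" can look like, here is a minimal sketch of a binary logistic regression trained with batch gradient descent. It is only an illustration, not a question from any real interview: the class name TinyLogisticRegression, the helper make_toy_data, the hyperparameters, and the synthetic data are all assumptions made up for this example. The fit/predict shape loosely mirrors the sklearn convention.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


class TinyLogisticRegression:
    """Hypothetical from-scratch binary classifier with a fit/predict API."""

    def __init__(self, lr=0.1, epochs=500):
        self.lr = lr
        self.epochs = epochs
        self.weights = []
        self.bias = 0.0

    def predict_proba(self, X):
        # P(y = 1 | x) = sigmoid(w . x + b) for every row of X.
        return [
            sigmoid(sum(w * x for w, x in zip(self.weights, row)) + self.bias)
            for row in X
        ]

    def fit(self, X, y):
        n, d = len(X), len(X[0])
        self.weights = [0.0] * d
        self.bias = 0.0
        for _ in range(self.epochs):
            probs = self.predict_proba(X)
            # Gradient of the mean log-loss: average of (p - y) * x.
            grad_w = [
                sum((p - t) * row[j] for p, t, row in zip(probs, y, X)) / n
                for j in range(d)
            ]
            grad_b = sum(p - t for p, t in zip(probs, y)) / n
            self.weights = [w - self.lr * g for w, g in zip(self.weights, grad_w)]
            self.bias -= self.lr * grad_b
        return self

    def predict(self, X):
        return [1 if p >= 0.5 else 0 for p in self.predict_proba(X)]


def make_toy_data(n=200, seed=0):
    # Two well-separated 2D Gaussian blobs (made-up data for the sketch).
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label else -2.0
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


X, y = make_toy_data()
model = TinyLogisticRegression().fit(X, y)
accuracy = sum(p == t for p, t in zip(model.predict(X), y)) / len(y)
print(f"training accuracy: {accuracy:.3f}")
```

      In an interview setting the discussion around such code (why the gradient has this form, how to regularize, how to evaluate on held-out data) probably matters as much as the code itself.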

      The actual process varies, but it will be some mixture of the above.

      Allow me to be a bit self-promotional: I published a YouTube video with a basic walk-through of what I expect to hear from a staff-level MLE in the ML System Design round.

  • 3
    Friendly Tarodactyl
    Taro Community
    2 months ago
    1. Questions:

    Is there a way to be a part of the ML world without having to deep dive into the math part of ML?

    The math in ML intimidates me.

    I would like to be an ML Infra engineer. I work as a distributed systems engineer and extending my domain to ML would really help me in my career.

    What is the path that I'd have to follow to become an ML infra engineer?

    I'm not talking about MLOps here but actually building good scalable systems for ML based infrastructure. I don't want to target ML based DevOps roles.

    Check out this job at PayPal which covers what I'm talking about : https://www.linkedin.com/jobs/view/4083052214

    If it's not possible to avoid the math part, how can I become really good at it? I don't want to watch a MOOC from Udemy/Deep learning that only covers bits and pieces.

    I would prefer a comprehensive learning resource that can take me from 0 to 100.

    2. I'm a software engineer at a web3-based distributed systems company.

      This space sometimes scares me because of how shady some projects might be. They don't really focus on the tech and are just hell bent on the tokenomics. I think ML infra might be a good switch after I get some good experience.

    • 3
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      Yes, I would say that if infra is your primary concern (and there is a lot of need for good ML infra people) then fundamental computer science is much more important than math. LLVMs are a must. If you understand derivatives and matrix multiplication you have all the math you need for that role.
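
      As a rough illustration of how far "derivatives and matrix multiplication" go, here is a minimal sketch that computes the gradient of a single linear layer with the chain rule and checks it against finite differences. The layer, the squared-error loss, the numbers, and every function name are assumptions chosen for this example; nothing here comes from a specific infra codebase.

```python
def matmul(A, B):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def loss(W, x, target):
    """Squared error 0.5 * ||W x - target||^2 for a single linear layer."""
    y = matmul(W, x)
    return 0.5 * sum((yi[0] - ti[0]) ** 2 for yi, ti in zip(y, target))


def analytic_grad(W, x, target):
    """dL/dW = (W x - target) x^T, obtained with the chain rule."""
    y = matmul(W, x)
    error = [[yi[0] - ti[0]] for yi, ti in zip(y, target)]
    return matmul(error, transpose(x))


def numeric_grad(W, x, target, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grad = [[0.0] * len(W[0]) for _ in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            W[i][j] += eps
            up = loss(W, x, target)
            W[i][j] -= 2 * eps
            down = loss(W, x, target)
            W[i][j] += eps  # restore the original weight
            grad[i][j] = (up - down) / (2 * eps)
    return grad


W = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]  # 2 x 3 weight matrix
x = [[1.0], [2.0], [3.0]]                 # 3 x 1 input column
target = [[1.0], [0.0]]                   # 2 x 1 target column

g_analytic = analytic_grad(W, x, target)
g_numeric = numeric_grad(W, x, target)
max_diff = max(
    abs(a - n) for ra, rn in zip(g_analytic, g_numeric) for a, n in zip(ra, rn)
)
print(g_analytic, max_diff)
```

      The same two ingredients, a matrix product in the forward pass and a chain-rule product in the backward pass, are what training frameworks run at scale.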

      I am working on a comprehensive resource, but it will likely not be as simple as a course or several. In coaching people one-on-one I am seeing a lot of variability so I think the only way for me to do this well is to guide this pretty carefully with regular check-ins and a lot of interaction. Everyone comes from a different place and is going toward a slightly different destination.

    • 0
      Friendly Tarodactyl
      Taro Community
      2 months ago

      Awesome, that sounds nice. Do you offer 1-1 mentorship services for people looking to transition?

    • 4
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      @Friendly Tarodactyl

      I do, but I am currently working on systematizing it. It should be out shortly; I will put a link here when it exists.

  • 2
    Supportive Tarodactyl
    Taro Community
    2 months ago

    Thanks for doing this :) I have many questions. Some have been asked already.
    I am looking forward to watching the course when it's out!

    Role

    • What are the core skills required for the role?
    • A perspective on an MLE's career progression.

    Interviews

    • What are the different interview types? (The equivalent to DSA/System Design/Behavioral for Software Engineering positions)
    • In general, what's the weight of each one of those skills during interviews (based on different factors e.g: seniority/business/size of the company/etc.)?
    • How to best prepare for MLE interviews?
    • Related to questions above : is it helpful to study Leetcode? Are there other helpful resources you recommend?
    • MLE roles at FAANG vs start-up vs other large companies.

    Side Projects

    • What's a good MLE project to include on resume? And what's a bad one?
    • What indicators are hiring managers searching for when looking at a side project?
    • 2
      Tech Lead @ Robinhood, Meta, Course Hero
      2 months ago
      • What's a good MLE project to include on resume? And what's a bad one?
      • What indicators are hiring managers searching for when looking at a side project?

      I'm not an MLE, but I imagine that the project having actual users is the most important thing, just like it is for every other domain of engineering. It should be as I describe here: https://www.jointaro.com/course/build-side-projects-with-500k-users-coming-up-with-an-idea/what-makes-a-project-valuable/

    • 4
      Thoughtful Tarodactyl
      Taro Community
      2 months ago

      @Alex the issue with projects having actual users in MLE is that it's pretty dang hard to do that without getting mucked into tons of other skills that are irrelevant for MLE, such as frontend/marketing.

      What makes MLE valuable in industry is the data. Without data you can't build anything of value, which is why 90% of MLE projects are just stuff like taking some public dataset like MNIST, throwing some models at it, and writing an article, which is quite frankly not impressive.

      The thing about ML is that it's almost always a feature, not a product. This is why many companies that try to make AI the product fail (e.g. Rabbit R1, the Humane pin).

    • 4
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      Let me take these in order:

      Role
      There are too many "core skills" to cover, honestly, so I will take a cop-out answer here and say hunger for learning and never being stuck. I'll cover the technical skills in the course, but those two are foundational. MLEs constantly operate in situations where we don't have enough (data, specifications, time...) and need to have creative solutions. On top of that the field is evolving all the time but you need to know the difference between "hype" and "innovation" and they often look similar. You are continuously learning the fundamentals.

      MLE career progression is basically the same as SWE. Technical or management route are open to you.

      Interviews
      I covered interview types in another answer.

      Preparation depends on where you are coming from and where you are applying. I will say that DSA is pretty fundamental and the bar is the same as for SWE. Honestly, preparing for DSA is a really high-leverage activity. I am working on a bunch of resources for ML system design (system, example). I will have more resources for other rounds soon.

      FAANG vs startups vs other large companies is a great question and the roles change a lot depending on where you are. Primary differences are breadth over depth (in FAANG you tend to specialize a lot more) and learning opportunities (honestly, you have to be a self-starter to stay up-to-date at FAANG whereas in small companies it is almost annoying how many new things pop up each day).

      Side Projects
      I will echo Alex on the projects. I understand that it seems like marketing and UX are irrelevant, but I assure you they are not. When you work at any company, your VP is essentially your VC, and you must convince them to "invest" in your project over a hundred others your org could go after. I am not saying that most of your effort should go to those, but "selling" useful things for free should not be that hard.

      Plus you can always get creative: email can be your UX where you summarize (for example) US congressional votes and send that to your subscribers every week.

      The only thing that matters about the project is external validation. I have talked about this above, but most HMs ignore "toy" projects. As to where to find data... that problem doesn't go away even when you work for Google :) Every project starts with "if only we had...", and showing ingenuity here is an important signal. Not every ML model is a transformer model. I have trained Bayesian models in production with 30 highly curated examples, and they performed well. Many projects can rely on generated data.
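
      To make the small-data point concrete, here is a minimal sketch of one of the simplest Bayesian models, a Beta-Binomial conjugate update, fitted on 30 labels. This is a stand-in chosen for illustration and not the production model mentioned above (the reply does not say which model was used); the uniform Beta(1, 1) prior, the function names, and the 21/9 label split are all made-up assumptions.

```python
def beta_posterior(successes, failures, prior_alpha=1.0, prior_beta=1.0):
    """Conjugate update: Beta(a, b) prior + Bernoulli data -> Beta(a + s, b + f)."""
    return prior_alpha + successes, prior_beta + failures


def beta_mean(alpha, beta):
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


# 30 hypothetical curated labels: 21 positives and 9 negatives.
labels = [1] * 21 + [0] * 9
successes = sum(labels)
failures = len(labels) - successes

alpha, beta = beta_posterior(successes, failures)
posterior_mean = beta_mean(alpha, beta)
posterior_std = beta_variance(alpha, beta) ** 0.5
print(f"posterior Beta({alpha}, {beta}): mean={posterior_mean:.4f}, std={posterior_std:.4f}")
```

      The posterior carries its own uncertainty estimate, which is part of why Bayesian approaches can remain usable when only a few dozen examples are available.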

      This is what I mean by "not getting stuck". Many of these issues are not magically resolved just because you have a job, but the best MLEs always have 3-4 ways we can "go from here" and still accomplish what we are after.

    • 1
      Supportive Tarodactyl
      Taro Community
      2 months ago

      @Ilya Reznik I appreciate all these insights, this is very helpful! Thanks!

    • 1
      Thoughtful Tarodactyl
      Taro Community
      2 months ago

      Makes sense, thank you so much. this is very helpful!

  • 2
    Supportive Tarodactyl
    Taro Community
    2 months ago

    Hey Ilya,

    For those passionate about working in pioneering technology and considering a transition to MLE, but with 10 years of deep experience in another software engineering discipline (e.g., mobile engineering at FAANG companies), how do the economics/compensation of an MLE career compare to SWE?

    Would making the switch be financially attractive, or are the comps similar? I’d love to hear your perspective—both short-term and long-term.

    Thanks!

    • 1
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      MLEs typically make a bit more, I think. The figure I heard is about 15% more than full-stack engineers at the same level; I'm not sure about mobile.

      But... (there's always a but) you will not be at the same level as an MLE as you are in the discipline you have worked in for ten years. By the time you hit these levels the most certain financial return is to go all in. There are many other reasons to go into ML, but financial is probably not one of them.

      Longer term the ceiling in ML is currently higher. Will it be when you get there? I think so, but nothing is guaranteed.

  • 2
    Helpful Tarodactyl
    Taro Community
    2 months ago

    How high/low signal are winning Kaggle competitions? I heard online that they're not high signal. From your experience, is this true? Why or why not?

    • 1
      AI/ML Eng @ Series C startup
      2 months ago

      Low. Kaggle ranking is just a number. Open-source and building in public generates far more inbound than Kaggle. You can generate dozens of inbound interviews this way. Just do cool stuff, then write a little about it. Rinse and repeat.

      The reason for this is people want to see cool shit. A ranking number is boring.

    • 1
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      If you are a data scientist, Kaggle helps. For MLE it is better than nothing at entry level, but not by much. Kaggle is pretty far from what makes a great MLE; it is like competitive programming not transferring well to being a distributed systems specialist... Yes, there is overlap, but there are higher-ROI ways to spend your time (like building a project people will use).

  • 2
    Helpful Tarodactyl
    Taro Community
    2 months ago

    I’ve worked as an ML Engineer for 2 years at Microsoft, where I trained tree-based models, managed inference pipelines, and conducted experimentation at Microsoft scale. Later, I transitioned to a project focused on building auto-featurization pipelines, utilizing AutoML and explainability techniques, where I worked for another 2 years.

    After a re-org, I was moved to a non-ML-related backend engineering role, where I’ve been for the past 2 years. Prior to joining Microsoft, I also had 2 years of full-stack development experience. I hold a Master’s degree where I studied machine learning, and I’m passionate about returning to the ML field.

    I’m familiar with fundamental ML concepts such as classification (tree-based models, logistic regression, SVMs), clustering (K-means, hierarchical clustering), and Data Mining (bag of words, TF-IDF), but I’ve never learnt neural networks or deep learning, either professionally or through MOOCs.

    Recently, a recruiter from Meta reached out about Software Engineer, Machine Learning roles.

    • Do you think my current knowledge and experience are sufficient to prepare for the ML design rounds? Or should I focus on taking an online course on neural networks/deep learning and build projects before interviewing?
    • If possible, could you share a comprehensive list of "must-know" and "good-to-know" topics for ML interviews?
  • 1
    Staff Software Engineer at Warner Bros. Discovery
    2 months ago

    Suppose someone recently took a 2-year career break after 8 years of experience in mobile development, and now wants to restart his career as an ML engineer. Will companies consider him, and what will they look for?

    • 2
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      A long career break with no ML expertise is going to be tough. I'm not sure if you have a master's degree; if not, this may be a great time to get one that is ML-focused. Start networking right away and try to get an internship in the summer between the two years.

      Like I said earlier, if you can show a project that is validated by others (users, open source committers, conference reviewers, etc.), that is much stronger.

  • 1
    Helpful Tarodactyl
    Taro Community
    2 months ago

    Are publications the best way to differentiate yourself from the crowd?

    For MLEs publishing ML papers, how much does publication venue matter? For example, if you publish at a tier-1 conference (ICLR, NeurIPS, ICML) vs something less prestigious (AAAI, TMLR, etc.), how much would that weigh against you?

    For generalist MLEs, are papers submitted to specialized conferences (SIGGRAPH, COLT, AISTATS, etc.) still valued? What about if you publish ML-related papers in non-ML venues, for example applied ML papers at MechE, Bio, or ChemE conferences? Will managers (or recruiters) be able to tell the difference between workshop papers and main conference track papers?

    • 3
      AI/ML Eng @ Series C startup
      2 months ago

      To answer all your immediate questions: Papers are a good way, but they take lots of time and your work is unlikely to be rewarded. There is no certainty that you'll publish a decent paper, if at all. Generally ML people don't care about applied ML papers as much. Worthwhile top-tier papers take 12-18 months at a minimum from idea inception to acceptance/release. Second-tier conferences like AAAI and KDD are still good, but less prestigious. Workshop papers are held in much lower regard than real papers.

      Some general points:

      • ROI on papers is low, given how much time/effort you're putting in. If you want to do purely model research though, this might be the only way
      • Very few ML engineers are purely doing model work. The majority are doing systems for ML, SWE for ML, or infra for ML.
      • Open source is pretty good ROI if you know what you're doing; there's a Taro course on this.
      • Switching from engineering work to science work has its own risks. It's hard to measure research productivity and you might find yourself at a C-tier research shop with some bums.
    • 2
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      Elliot Kang gave a good answer! Here is some more:

      So I am actually at NeurIPS right now and polled a few recruiters for you (sample size 6 across a range of companies):

      • generally they follow the academic value system: the higher the conference prestige the more worthwhile.
      • Getting a spotlight paper or even an oral presentation is much more valuable (I believe the acceptance rate for orals is sub-1% at NeurIPS), and getting best paper in the area that the company cares about (which is basically transformer models for everyone now) is the strongest signal.
      • Journals are not considered much, conferences are most interesting.
      • The amount of stuff you have to learn to publish a paper is immense! Honestly this will be an extremely valuable experience to you... but unless you get published, like Elliot said, there is no resume signal.
      • I have a video coming out on 12-16-2024 about my experience at NeurIPS 2024 on the https://www.youtube.com/@MLEpath YouTube channel, including a quick view from someone who was published, where he talks about what it took. TL;DR: it is about a year of work.
      • One of the hardest things is knowing what the conference cares about this year.
      • Is this the best ROI on your time? Depends!
  • 0
    Thoughtful Tarodactyl
    Taro Community
    2 months ago

    Can you comment on the difference in skills needed across these different verticals?

    1. ML Engineer
    2. Data Scientist
    3. Data Analyst

    Is it possible to get Data Science jobs (not data analytics) without a PhD?

    What even is an ML Engineer? As an ML Engineer will I be training models?

    • 3
      Ilya Reznik [OP]
      Guiding ML Engineers through their career journey. ex-Head of ML, ex-Meta Staff SWE, ex-Adobe
      2 months ago

      Titles can vary a lot, but generally a data analyst uses the stuff a data scientist developed to produce one report, whereas many users use the stuff an MLE developed to accomplish their end goals.

      MLEs do train models, but we also own ML systems and some of us specialize in Ops or Infra... We train models, write code, maintain models in production, set strategy...

      You can become a data scientist without an advanced degree.