How to leverage side projects as a Data Engineer?

Mid-Level Data Engineer at Taro Community
10 months ago

I'm not too happy with my current job: I feel like career growth prospects are dim and the work is not challenging enough. Given this, my best career move is to find a job at a different company.

Given that the work at my job is not good, I want to use side projects (which Taro tells me have a ton of benefits) to get "equivalent" job experience, so that when I interview at other companies I can leverage the experience from awesome side projects to do well.

More specifically, most data engineering jobs want experience with Airflow and PySpark. My current job only has me doing SQL stuff, so I want a side project that leverages Airflow/Spark.

Problem 1: If I focus on just building apps/web apps to get users, then that won't help my data engineering career, because most apps don't need a full PySpark pipeline with Airflow and whatnot.

Conversely, if I focus on building an awesome data engineering pipeline, I'm likely not solving any real-world issues and will have 0 users, but the skills will help.

Problem 2: If I focus on consumer apps, I'll have to learn React (which I 100% don't use at work) and spend time doing backend stuff, which isn't helping me grow as a DE, because I read on Taro that you should be going deep, not wide.

6464 Views

8 Comments

Discussion

(8 comments)

1
Rahul Pandey
•Tech Lead/Manager at Meta, Pinterest, Kosei
10 months ago

"My current job only has me doing SQL stuff, so I want a side project that leverages Airflow/Spark"

This doesn't answer your question about doing a side project, but can you shoehorn your current job to be more amenable to the stuff you want to work on?

If your company is big enough, it should have a lot of diversity of technologies and people. Can you find the areas that are interesting and work on those?

The benefit of "double dipping" your day job with side projects is that you'll learn about how to use the tech in a production environment, with people who have more experience than you. This will be much faster for you to ramp up and learn transferable skills for the next job.

I'd pursue this route even if it's not the perfect tech alignment, e.g. Hadoop instead of Spark.

The other option is to make open-source contributions, if you have some approachable tool or application which uses these data analytics pipelines.
- 0
  Alex Chiou
  •Tech Lead @ Robinhood, Meta, Course Hero
  10 months ago
  +1: Just getting better scope at your current job would be way better than trying to figure out a meaningful side project.
  
  You don't need your manager for this either. Go around your organization to data engineers you trust and ask them if they have anything meaningful on their backlog. If they're senior/staff level, they almost certainly do. From there, just pick up those tasks and do them. This situation is the best as there's mutual benefit:
  
  You get meaningful work
  
  They get a worker bee to delegate to and help them scale (they can claim some portion of your impact as they were the one who gave you the task)
- 0
  Mid-Level Data Engineer
  •Taro Community
  10 months ago
  Unfortunately I can't. I work for a pure classic finance company (not a hedge fund) where I do mostly analytics/engineering. Most of the work is doing SQL (+ managing the DB) -> extracting insights -> making PowerPoint slides/dashboards.
  
  We don't have the volume of data to be doing Spark, and we don't have any production models, so we don't need Airflow either. It's not something I can push within the org. It's also not a tech company, and I'm part of an analytics division.
  
  The ceiling for this analytics/engineering job is pretty low, which is why I want to upskill to get the higher-paying data engineering roles.
- 1
  Alex Chiou
  •Tech Lead @ Robinhood, Meta, Course Hero
  10 months ago
  I see, the answer here is pretty obvious then hehe: [Course] Ace Your Tech Interview And Get A Job As A Software Engineer
  
  I wouldn't gatekeep your job search on lacking skills. This will lead to you studying and building side projects forever. I recommend just applying a lot and seeing where that takes you. You can study/build in whatever free time remains.
  
  The important part about applying (especially if you apply a lot) is that you'll get data. If you're noticing common themes across rejections, you can target your outside studying to fill that specific gap. Many engineers don't realize that interviewing is actually a data collection exercise in disguise 🕵️
- 1
  Mid-Level Data Engineer
  •Taro Community
  10 months ago
  Thanks Alex, it helps. I was curious if you have any general advice for people looking to get more interviews for roles where users as a measure of impact is not super applicable, aka roles in infra/backend/data?
  
  Also, I get that getting users means the project is impressive, which makes it a better project, which would lead to more interviews. But why are projects with users better? Isn't a project with users just giving signal about my marketing/sales/product skills and not my engineering skills? Especially if I'm building a utility app/one-time-use app, e.g. a background remover.
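The suggestion earlier in this thread to treat rejections as data and look for common themes can be done with a very small script. This is only a minimal sketch: the `rejections` log, the company names, the theme labels, and the `top_gaps` helper are all hypothetical placeholders, and in practice the themes would come from your own notes after each rejection.

```python
from collections import Counter

# Hypothetical log: (company, stage reached, themes you noted after the rejection)
rejections = [
    ("Company A", "phone screen", ["spark", "airflow"]),
    ("Company B", "onsite", ["system design"]),
    ("Company C", "phone screen", ["spark"]),
    ("Company D", "resume", ["airflow", "spark"]),
]

def top_gaps(log, n=2):
    """Count how often each theme appears across rejections, most common first."""
    counts = Counter(theme for _, _, themes in log for theme in themes)
    return counts.most_common(n)

# The most frequent themes are the gaps worth targeting with outside study.
print(top_gaps(rejections))
```

With the placeholder data above, Spark shows up most often, so that would be the first gap to study.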
0
Alex Chiou
•Tech Lead @ Robinhood, Meta, Course Hero
10 months ago
Side projects are admittedly very tricky as a Data Engineer, as the role is inherently not user-facing. It is hard to make something that is easily shareable and gets tons of users.

I think a better path might be to do open-source contributions and help build up industry-leading data engineering libraries/components. Here's a good list: https://github.com/gunnarmorling/awesome-opensource-data-engineering

From there, follow the advice here: [Course] Become An Open Source Master
- 1
  Mid-Level Data Engineer [OP]
  •Taro Community
  10 months ago
  Thanks Alex. The issue with open source is that (from my limited experience) most projects are at the infra layer, while most jobs expect you to be a pro at the application layer, that is, building PySpark pipelines and using Airflow and stuff. I feel like the overlap between the infra layer and the application layer is not that much?
  
  I've briefly worked on open source for ML, and a lot of it needed me to know stuff at the compiler layer or the low-level PyTorch stuff. But when I try to train an ML model, that level of depth and the problems faced don't translate much?
  
  In other words, working on ML/data infra is very different from applying ML/data infra, and I'm not sure working on open source is the best path to gain better data engineering skills.
- 1
  Alex Chiou
  •Tech Lead @ Robinhood, Meta, Course Hero
  10 months ago
  Hmm, then maybe you can build a side project just for the learning as you mentioned? It won't help you get interviews as it likely won't get users, but learning is still valuable (and it can help you pass interviews once you get them).
  
  The other option is to find additional scope at your current role as I talked about in my reply to Rahul on this thread.
