What the heck are ETLs?

And other insights by the ✨ fabulous ✨ Nolwenn Belliard, Lead Data @ Tiller

What's the proven best way to learn a new language?

By surrounding yourself with native speakers. One of my cofounders pointed out that this is exactly what I do when I attend our 'tech meetings'.

Guilty. 🥺

I end up picking up new words, and trying to understand them.
I can even 'fake it' for a while and have a conversation using those, without really knowing what they mean.

The latest 'new word'? ETL.

I actually did some research on this topic (faking it is great, but only up to a point!).

I met Nolwenn Belliard, Lead Data at Tiller (recently acquired by SumUp), who gave me invaluable insights into ETLs, as well as a fresh perspective on data teams.

Anh-Tho Chuong: As the Lead Data at Tiller, how would you describe your responsibilities there?

Nolwenn Belliard: I joined Tiller two years ago, and my initial focus has been on bringing value to internal teams. This means providing the right analytics throughout the client lifecycle, working alongside the Operations, Marketing, and Sales teams.

For example, I've set up our datawarehouse and our BI tool.

Beyond setting up these tools, there was significant heavy lifting involved in understanding current business processes, how data is used, how we could model and document it, and how we could make our internal processes more efficient.

Once these tools are set up, it's a continuous effort to make sure internal teams adopt a data mindset and use our tools. This phase is often overlooked.
It means organizing regular training sessions and making sure the tools and dashboards teams have are the most useful ones for them.

My next challenges will be around putting Machine Learning algorithms in production, and using data in our product, for external users.

A.C.: You majored in medical engineering, how did you decide to work in tech and data? And why?

N.B.: I picked medical engineering because it mixes maths, biology, and IT, and I love spaces that are at the crossroads of several topics.

I also majored in entrepreneurship, and then looked for a career that could mix entrepreneurship and tech.

I've found this sweet spot within data roles in startups, where my job is to create value with data. I only use tech to create business value. Some people use Excel spreadsheets, I use Python and SQL.

A.C.: Let's jump into today's topic. ETLs are often mentioned by Growth teams, but they still seem like an obscure topic. How would you describe what an ETL is to a non-tech person?

N.B.: In an ultra-simplified version, I would say that an ETL is like a Zapier for data. It takes data from a data source (such as a CRM or our own product) and sends it to a datawarehouse.

A.C.: ETL stands for 'Extract', 'Transform', and 'Load'. So, from your explanation, I understand ETLs extract data from a data source and load it into a warehouse. How about the 'T' for 'Transform'?

N.B.: Actually, I've never used ETLs. I've used ELs (Extract and Load) or ELTs (Extract, Load, and Transform). With ELTs, data is transformed after it has been loaded into the datawarehouse.

The best known 'ETLs' are Fivetran and Stitch, and to me, they are more ELs than ETLs. They have very simple transformation features such as table joins.

I think 'transformation' of data is the step which adds the most value (otherwise we're just talking about moving data around with ELs). That's why it's usually best to use dedicated tools to perform it, such as dbt, or custom code with Airflow and Python.
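To make the EL-then-T idea concrete, here is a minimal sketch in Python. It is an illustration only, not how Fivetran, Stitch, or dbt work internally: the CRM rows, the table names, and the use of an in-memory SQLite database as a stand-in for the datawarehouse are all assumptions made for the example.

```python
import sqlite3

# Extract: hypothetical rows, as they might come out of a CRM export.
crm_rows = [
    ("acme", "2021-01-05", 120.0),
    ("acme", "2021-02-05", 80.0),
    ("globex", "2021-01-20", 200.0),
]

# An in-memory SQLite database stands in for the datawarehouse (assumption).
warehouse = sqlite3.connect(":memory:")

# Load: copy the rows as-is into a raw table, with no business logic applied.
warehouse.execute(
    "CREATE TABLE raw_payments (client TEXT, paid_on TEXT, amount REAL)"
)
warehouse.executemany("INSERT INTO raw_payments VALUES (?, ?, ?)", crm_rows)

# Transform: build an analytics-ready table inside the warehouse, using SQL.
warehouse.execute(
    """
    CREATE TABLE revenue_by_client AS
    SELECT client, SUM(amount) AS total_revenue
    FROM raw_payments
    GROUP BY client
    """
)

revenue_by_client = warehouse.execute(
    "SELECT client, total_revenue FROM revenue_by_client ORDER BY client"
).fetchall()
raw_row_count = warehouse.execute(
    "SELECT COUNT(*) FROM raw_payments"
).fetchone()[0]
print(revenue_by_client)
```

The point of the sketch is the ordering: the raw table is loaded first and stays untouched, and the value-adding step is the SQL that runs afterwards inside the warehouse.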

A.C.: I've also heard of 'ELT'. I got that it means you 'Load' data before 'Transforming' it... but what are the pros and cons of using an ELT rather than an ETL?

N.B.: I think the main difference is that when you use an ETL, by 'transforming' before 'loading', you end up with data that is ready for a specific use, for instance a specific BI tool.

But this also means that you're less flexible than if you load 'untransformed/raw' data in a warehouse. Having raw data in the warehouse means you can transform it afterwards, as you need, as many times as you want.
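The trade-off can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the order records, the field names, and the `total_by` helper are hypothetical, and plain Python lists and dicts stand in for the warehouse.

```python
from collections import defaultdict

# Hypothetical raw records, as an EL tool might deliver them untouched.
raw_orders = [
    {"client": "acme", "month": "2021-01", "amount": 120.0},
    {"client": "acme", "month": "2021-02", "amount": 80.0},
    {"client": "globex", "month": "2021-01", "amount": 200.0},
]


def total_by(rows, key):
    """Sum the 'amount' field of the rows, grouped by the given key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# ETL-style: only the transformed result is stored, shaped for one use.
# The monthly detail is gone, so a per-month report cannot be rebuilt from it.
etl_store = total_by(raw_orders, "client")

# ELT-style: the raw rows are stored, so they can be transformed again later,
# as many times as needed, to answer questions nobody had asked at load time.
elt_store = list(raw_orders)
revenue_by_client = total_by(elt_store, "client")
revenue_by_month = total_by(elt_store, "month")
```

Here `etl_store` answers exactly one question, while `elt_store` can feed both the per-client and the per-month view.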

A.C.: What do you wish you knew sooner about ETLs?

N.B.: Choosing an ETL is exactly like choosing any internal tool: you can even develop one internally (just as some companies develop their own CRM instead of choosing Salesforce).

You should evaluate each tool through an ROI (Return on Investment) lens, as each one will have its own specificities.

When I chose our ETL, it was still quite new in France (we were the 1st startup in France to use Fivetran), and interviewing other users would have helped me grasp the key differences between Fivetran and Stitch (the 2 leading ETLs) more quickly.

A.C.: ETLs and ELTs are tools used by data and tech teams. However, it's great for non-tech teams (Growth, Sales, Marketing, Customer Success) to understand the technical data stack. In your view, what are the other main concepts or tools they should know about?

N.B.: I think the main concept to keep in mind is 'the unique source of data truth'. Where is it? Who owns it?

Data is teamwork that involves a lot of discipline, meaning not everyone should modify how data is structured (in a CRM, for instance): each team is responsible for inputting the data in an agreed format if they want the data to make sense afterwards.

A.C.: Data engineering often seems like a black box to non-tech teams, such as Growth teams. What piece of advice would you give to them to work hand in hand with data teams?

N.B.: Be proactive. Challenge the data you have, push ideas/projects. Data teams are here to bring business value.

Use the tools and dashboards that you can access, and help data teams improve them for you.

Also, educate yourself. Data teams usually organize internal training sessions (if not, suggest it!). I'm also a teacher at Emil School, which has a bootcamp about data specifically for non-tech people.

A.C.: Wow, amazing! Emil School sounds great! My co-founders actually created a few courses about data too: Raffi about SQL for BI, and Julien about tracking [1]. I'm also wondering, your role being very cross-functional, how do you prioritize projects?

N.B.: I always refer to the company OKRs (Objectives and Key Results). Based on the OKRs, the Data team will either support projects with data or lead specific projects, such as internal training or tool stack scalability.

Also, every 2 weeks, I meet the heads of Product, Finance, Growth, and Customer Success for a 30-min update. Together, we identify ways my team can support them on their day-to-day projects, so that we have a mix of day-to-day impact and mid/long-term projects.

A.C.: Let's say I'm joining a startup tomorrow, as the 1st 'data person'. What piece of advice would you give me?

N.B.: Like in any business, talk to your clients.

They are, in most cases, internal clients. Make sure to understand their job, their objectives, their challenges, and how they operate today.

It will help you design a data system that makes sense to them, and that they will actually adopt.

I myself spent my first 3 months at Tiller modeling the existing stack, processes, and data flows.

Also, there are so many resources online you can learn from. Be curious, and don't be afraid to step out of your comfort zone and explore.

A.C.: In your view, what's the biggest misconception about data engineers?

N.B.: That they are just developers.

It's true that no engineers are 'just developers', but data engineers are even closer to the business.

And with new off-the-shelf tools such as ETLs, data engineers will have more time to make an impact on the business, rather than building infrastructure.

And it's only the beginning!


Thank you so much Nolwenn! I wish I could attend the internal trainings you lead at Tiller. 😉

Want to go further?

  • Dive into perspectives on Modern Data Infrastructure with these 2 must-reads: one by a16z and one by dbt's founder

  • Book a personalized 25-min ‘data for business’ coaching session with the Lago team. What we usually help with: your tool stack, tracking plan, team organization, recruitment.

  • Join our invite-only community: comment on this post, and I’ll follow up!

[1] Raffi's bootcamp about data visualization, and Julien's email course on tracking