Hello everyone, hope you all are doing great. This is the best time of the year in Pune, India, where we all are celebrating the Ganesh festival after a gap of a couple of years due to the pandemic. Wishing all my readers a great time with friends & family during this festive season.
In today’s post, I’ll introduce you to a new concept known as “Reverse ETL”.
Most of the readers of this newsletter would be aware of ETL - the process of extracting data from source systems, transforming it & loading it into the data warehouse.
The warehouse is the repository where data from various source systems is persisted after going through a series of transformations, validation & standardization processes to integrate it into one single platform.
From a data engineering perspective, DWH can be considered the final layer of your data platform. However, extracting data out of the warehouse & sending it to other systems & applications is often considered the last leg of your data engineering process.
This very process of extracting data from the warehouse & sending it to other systems is known as Reverse ETL!
There are many products in the market (which can be considered as part of the Modern Data Stack) to implement the Reverse ETL process.
All these years, we have been implementing a similar process to extract data from the warehouse & send it to downstream applications. These requirements are not new & most of the enterprises implementing warehouses would have provisions for the same.
So why do we need dedicated tools for this when it can be implemented using custom code?
These modern tools offer features that can help you quickly implement the Reverse ETL process with much less effort. Some of these features are listed below
Out-of-box connectors for SaaS products like Salesforce, Marketo & many others. No custom code is required for calling APIs or to get tokens for authentication.
Direct mapping between the objects (like tables) from DWH(source) & SaaS(target). No custom code is required to map the source to target attributes.
Easily configure actions to be taken like insert, update or append records in SaaS applications.
Manual, Continuous or Scheduled sync between DWH & SaaS systems. The schedule can be for fixed time intervals like hourly, daily, weekly etc.
Reverse ETL is a process to sync your DWH (single source of truth) data with your external SaaS applications (System of action) that can help you to take required actions based on the DWH data. Reverse ETL tools can help you directly sync your data between DWH & target SaaS applications without writing any custom code
Eg
Your DWH can give you insights into a customer’s credit card spending trends. A customer with a heavy spending pattern indicates an opportunity to increase the customer’s credit limit. A marketing SaaS application ( “system of action”) can take this action to send marketing emails to the customer to apply for this credit limit increase. Reverse ETL can facilitate this by syncing the data between DWH & SaaS applications, thus bridging the gap between insights to action.
This has also led to a new stream of work called Operational Analytics. You can read more about it here.
Some of the leading products that offer Reverse ETL are
If you want to dive further into this topic, you can go through the articles & blogs published by Census.
I hope you enjoyed today’s post. Do comment & let me know your thoughts or experiences of working on various Reverse ETL use cases.
This is new concept to me, Why do we need reverse ETL when we can consume DWH data into BI ?