This blog post discusses the differences between ETL (extract, transform, load) and ELT (extract, load, transform) in data integration. It explains how ETL emerged in the 1970s as a way to integrate data into a relational enterprise data warehouse (EDW): data is extracted from source systems, transformed to fit the warehouse schema, and then loaded. ETL tools require expertise to install and manage, and the transformation logic is typically accessible only to IT personnel.
However, the advent of big data and data lakes brought significant challenges to traditional ETL. The velocity, volume, and variety of big data strained the capacity, resources, and timeliness of ETL processes. Additionally, data lakes eliminated the need for predefined schemas, and accessing raw data became crucial. These factors led to the emergence of ELT, where the data is extracted and loaded before transformation.
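The difference in ordering can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the target repository; the table names, sample records, and function names are hypothetical and are not taken from the blog. In the ETL path the data is cleaned before it is loaded, so only the transformed rows reach the target. In the ELT path the raw rows are loaded first and the transformation runs inside the target, so the raw data stays available.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
RAW_ORDERS = [
    ("1", " alice ", "19.5"),
    ("2", "BOB", "5.0"),
    ("3", "Carol", "12.25"),
]


def run_etl(conn):
    """ETL: transform in the pipeline, then load only the cleaned rows."""
    conn.execute(
        "CREATE TABLE orders_clean (id INTEGER, customer TEXT, amount REAL)"
    )
    cleaned = [
        (int(order_id), name.strip().lower(), float(amount))
        for order_id, name, amount in RAW_ORDERS
    ]
    conn.executemany("INSERT INTO orders_clean VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load raw data as-is, then transform inside the target store."""
    conn.execute("CREATE TABLE orders_raw (id TEXT, customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    # The transformation is expressed in SQL and executed by the target.
    conn.execute(
        """
        CREATE TABLE orders_clean AS
        SELECT CAST(id AS INTEGER)     AS id,
               LOWER(TRIM(customer))   AS customer,
               CAST(amount AS REAL)    AS amount
        FROM orders_raw
        """
    )


def fetch_clean(conn):
    """Return the cleaned rows in a stable order."""
    return conn.execute(
        "SELECT id, customer, amount FROM orders_clean ORDER BY id"
    ).fetchall()


etl_conn = sqlite3.connect(":memory:")
run_etl(etl_conn)

elt_conn = sqlite3.connect(":memory:")
run_elt(elt_conn)
```

Both paths end with the same cleaned table, but only the ELT database still holds orders_raw, which is what lets data consumers go back to the untransformed data later.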
ELT is designed for big data and cloud repositories, offering scalability, performance, and cost benefits. ELT tools prioritize data access for data consumers and allow transparent access to multiple data sources, regardless of storage format. ELT is seen as the natural evolution of ETL for the world of big data.
The blog suggests that organizations should consider ELT tools in specific scenarios, such as when facing challenges with increasing data volume, velocity, and variety, or when data warehousing and ETL costs are becoming unmanageable. ELT is also recommended for those already invested in big data and cloud storage.
The blog acknowledges that data governance, particularly ensuring the veracity of data from various sources, is a challenge for ELT. However, ELT tools are evolving to incorporate data discovery, validation, lineage, quality, and access control, among other governance aspects. The author suggests that ELT is evolving into EL+T, where the transform step includes data governance, and that new EL+T platforms will enable a broader range of use cases and data consumer groups.
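As a rough illustration of what folding governance into the transform step could look like, the sketch below validates already-loaded records and keeps a simple lineage field. The Record type, the validation rules, and the function names are all assumptions made for this example; the blog does not describe a specific implementation or tool.

```python
from dataclasses import dataclass


@dataclass
class Record:
    source: str    # lineage: which system the row was loaded from (hypothetical)
    customer: str
    amount: str    # raw value, loaded untransformed


def validate(record):
    """Return a list of rule violations; an empty list means the record passes."""
    problems = []
    if not record.customer.strip():
        problems.append("missing customer")
    try:
        if float(record.amount) < 0:
            problems.append("negative amount")
    except ValueError:
        problems.append("amount is not numeric")
    return problems


def transform_with_governance(records):
    """Split loaded records into accepted rows and rejected rows with reasons."""
    accepted, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected
```

Because each rejected record is returned together with its reasons and its source, a data steward could trace quality problems back to the originating system instead of silently dropping rows.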
Overall, the blog traces the evolution from ETL to ELT as a response to big data, data lakes, and cloud storage. It recommends considering ELT in the scenarios described above while stressing that data governance remains essential as the landscape evolves.
The complete blog post can be viewed at https://www.forbes.com/sites/forbestechcouncil/2021/05/04/is-elt-the-ultimate-replacement-for-etl-part-i/amp/