Adrian
Sep 19, 2020


If one needs to explain ETL and data streaming in terms of cakes and belts to an IT specialist or even to a manager, then something is most probably wrong with the approach (and one should have stuck to baking).

There are different business scenarios one needs to consider, and the most important requirement is the timeliness of the data - does the business require real-time data, near real-time data, or is a delay of a few hours or one day acceptable? Except for the first two scenarios, solutions based on ETL or ELT are enough for most demands.
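As a rough illustration of that first question, here is a minimal sketch that starts from the tolerated delay. The names (`Approach`, `suggest_approach`) and the five-minute cut-off are hypothetical and chosen only for the example; where "near real-time" ends is something each business has to define for itself.

```python
from datetime import timedelta
from enum import Enum


class Approach(Enum):
    CONSIDER_STREAMING = "consider data streaming"
    BATCH = "ETL/ELT is probably enough"


# Hypothetical cut-off (an assumption, not a standard): delays at or below
# this value are treated as "real-time or near real-time".
NEAR_REAL_TIME_LIMIT = timedelta(minutes=5)


def suggest_approach(max_acceptable_delay: timedelta) -> Approach:
    """Return a first-pass suggestion based only on the tolerated data delay."""
    if max_acceptable_delay <= NEAR_REAL_TIME_LIMIT:
        return Approach.CONSIDER_STREAMING
    return Approach.BATCH


if __name__ == "__main__":
    for delay in (timedelta(seconds=1), timedelta(hours=4), timedelta(days=1)):
        print(delay, "->", suggest_approach(delay).value)
```

Timeliness is only the first filter; costs, tool maturity and the available skills still need to be weighed before deciding.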

Just because there are new toys in IT, it doesn't mean one needs to acquire the respective tools (unless somebody plays the baby and needs the respective toys at any price, and the price is often high relative to the benefits, especially because the tools need several years to reach the required maturity).

Data streaming might be new, though data integrations based on SOA and similar technologies are already 20 years old. Comparing data streaming with SOA makes more sense, as they seem to support similar usage scenarios. SOA is great when implemented correctly, though most of the integrations come with challenges (data quality, troubleshooting, costs, data and process coverage) and bring no benefit for the BI/Analytics scenarios.

ETL and ELT are more appropriate for data warehousing, BI/Analytics or Data Science because one has enough flexibility, the costs are relatively small and the required resources are easier to acquire. Sure, the two architectures have their limitations, though the chances of reaching the objectives are higher.

One of the important lessons to be learned in IT is "even if you can, it doesn't mean you should". Just because you can use data streaming, it doesn't mean you should use it in each scenario where data integration or simply data movement is needed. There are business scenarios in which a given architecture is appropriate and others in which it is not. That's the challenge for many specialists - deciding which tool best fits a given scenario.

On the other hand, there's the trend to oversell the capabilities of new technologies, and sales pitches are the best example of it. Only when one starts using a technology do the problems arise - the devil (almost) always hides in the details. Sure, new technologies will hopefully make a difference, they have a certain potential, though the potential is relative.

It would be more interesting for me to know what the business scenarios are in which a given tool provides proven benefits, what the respective benefits are or were, and what the challenges are, than to play with belts and cakes. I quite often see, especially on platforms such as Medium, attempts to sell more in a title than common sense allows. Old technologies and architectures will continue to exist as long as the newer ones don't address the challenges of the former, and as long as they don't guarantee lower costs or other tangible benefits.

Adrian

IT professional/blogger with more than 24 years experience in IT - Software Engineering, BI & Analytics, Data, Project, Quality, Database & Knowledge Management