
Data Lake architectures vertigo — do you get it?

In the previous post, I explained what a data lake is and outlined its main business benefits. This time we will take a brief look at the related, most commonly adopted data architectures.

The data lake has only recently become a mainstream solution, adopted by businesses that want to stay on top of their data. Digital transformation was only the beginning: data creation in 2035 is forecast to reach 45 times that of 2020 (source: StatistaCharts). I believe that the value extracted from data grows in line with its volume, and data platform implementations are following suit. It is therefore no surprise that new data architecture patterns are cropping up and that the data lake has gone through its first evolutions. There are now three distinct data lake implementations and architectures: centralised (the plain data lake), decentralised (data mesh), and a third that integrates the data warehouse with the data lake, called the Lakehouse.

When we talk about a data lake we usually mean the centralised architecture. Having all data in one place certainly has its benefits, and it works well for organisations with small to medium data footprints. However, for those that produce large volumes of data across the organisation, a decentralised data architecture is necessary. Enter data mesh, an architecture that organises data by business domain (e.g. marketing and sales). This gives the producers of a given dataset more ownership of it and connects them more closely with its consumers.

Last but not least, organisations that already use both a data warehouse and a data lake may want to implement the Lakehouse architecture, which integrates the two. From an operational standpoint, this avoids issues related to data gravity, namely unnecessary data movement and data redundancy. Business-wise, robust analytics allows for a faster time to market.

Now that we have covered the architectural patterns related to the data lake, my next posts will take a more detailed look at each architecture.

As always, for any data platform needs, contact us at DataPhoenix — a small end-to-end data solutions provider.

Image source: AWS.