Zhamak Dehghani formulated the “data mesh paradigm shift” in May 2019, and her book on the topic, “Data Mesh – Delivering Data-Driven Value at Scale”, is due at the end of this year. I’m really looking forward to it! Piethein Strengholt has already published a book based on the work at ABN Amro, which carries many of the same ideas.
I have been writing quite a bit about the idea of a data mesh myself since my first article, “Data Mesh Applied”, at the end of 2019. So I thought it would be nice to collect those articles here in a somewhat guided form: a handbook, so to speak, that you can follow along to understand data meshes more deeply. I would also point you towards the Data Mesh Learning Community, which collects lots and lots of content on data meshes.
Part 1: The Data Mesh is pretty natural
For me, the data mesh is just the decentralization of the responsibility for data. And yes, that differs from the tighter definition ThoughtWorks provides.
I wrote an article about why this decentralization is just the natural progression of technology decentralization, following DevOps, micro-frontends, and microservices.
But given that ThoughtWorks usually describes itself not as a creator but as a curator of ideas that companies actually implement, I think it’s fine to have a second, looser definition that, in my opinion, more closely captures what is actually important.
(To be fair, though, ThoughtWorks has been deeply involved in all four of the major decentralization shifts in tech mentioned above.)
Part 2: Data Mesh Applied 1 & 2
I also wrote up some ideas on how to implement an AWS S3-based data mesh, and a lot about the process of getting there.
Then, after working on a data mesh inside a company for a while, I realized that technical implementations of data meshes take very different forms, so I wrote up a summary of the different kinds.
Part 3: Potential Fallacies of the Data Mesh
After digging through tons of different data mesh implementations, I stumbled upon a number of possible fallacies, so I wrote up the three major data mesh fallacies.
Part 4: Implementations over Implementations
Throughout 2021, I discussed various real-world implementations of data meshes in my newsletter “Three Data Point Thursday”, covering both their good and bad sides. Here are the links to the newsletter editions with commentary on all of them:
- The implementation done at Delivery Hero, based on BigQuery
- HelloFresh with joint dimensions and their Data Mesh
- JPMorganChase and their huge implementation of a data mesh
- Gloo.us and their data mesh based on Kafka
- The data mesh built by Kolibri Games, a great fit for a startup: small and minimalist
- The data meshes as built by Saxo Bank, ABN Amro and an example from a workshop held by ThoughtWorks