Domain-Driven Design and microservices have changed the way software engineers work. I think they can multiply the productivity of data teams as well, but they have to be applied slightly differently, because the typical centralized data team is in a very different situation from the typical (decentralized) software development team. This article explores that different way of working with your central data system. It offers a simple, iterative way, much like the move to microservices, to start with the default monolithic "one pot to capture it all". It then lets you slowly break […]
Data as Code — Principles, What it is and Why Now?
No, Data as Code (DaC) is not just versioning data! It’s applying the whole software engineering toolchain to data. For that, we need principles. This post is part of a small series beginning with: Data as Code — Achieving Zero Production Defects for Analytics Datasets. Image by Sven Balnojan. Data as Code is a simple concept, just like Infrastructure as Code (IaC). It just says “Treat your data as code”. And yet, after IaC appeared on the ThoughtWorks Radar in 2011, it still took roughly 10 years to “settle in”, and it is still in an uneasy spot where IaC advocates feel they need to remind people […]