Data as Code — Principles, What it is and Why Now?

No, DaC is not just versioning data! It’s applying the whole software engineering toolchain to data. For that, we need principles. This post is part of a small series beginning with: Data as Code — Achieving Zero Production Defects for Analytics Datasets. Image by Sven Balnojan. Data as Code is a simple concept, just like Infrastructure as Code. It just says “Treat your data as code”. And yet, after IaC appeared on the ThoughtWorks Radar in 2011, it still took roughly 10 years to “settle in”, and it is still in an uneasy spot where IaC advocates feel they need to remind people […]

How Conways Corollary Wrecks Your Data Organization

Opinion. Conway’s law has an evil corollary that goes unnoticed in the dev world but wrecks your data org. “You think it’s a hack, but all you’re hacking apart is the value of your data.” (the author) Image by Sven Balnojan. Melvin Conway, a brilliant computer scientist who also invented the notion of a coroutine, has become pretty famous in the last 20 years for a law named after him: “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.” Turns out this is very important as we move towards […]

Data as Code — Achieving Zero Production Defects for Analytics Datasets

Notes from Industry. How to apply the true Data as Code philosophy to achieve close to zero production defects using the tried and true methods from software development on data. Yep, zero defects! That’d be awesome. Image by the author. Data teams spend close to 60% of their time on operational things, not producing value. They also experience a large number of bugs in their data systems, according to the DataKitchen study and Gartner’s survey. Yet, in the software development world, we already have the philosophies in place that allow high-performing teams to deliver both quickly and at a high level of quality, […]

4 Excellent Examples of Agile “Nearsighted” Roadmaps

How GitHub, Meltano, Airbyte, and Atlassian manage to stay focused on bigger goals while still staying flexible and agile. “As a PM, you must plan for the near term milestones (more detailed) as well as for the long term strategy (more broad), and everything in between. Considered as a spectrum, these form a nearsighted roadmap. This will enable you to efficiently communicate both internally and externally how the team is planning to deliver on the product vision.” (from the GitLab Product Handbook) “In preparing for battle I have always found that plans are useless, but planning is indispensable.” (Dwight D. Eisenhower) […]
