How To Build the Next Mega Open Source Project

8 lessons learned from a 3-dimensional framework for understanding how to turn your open-source project into the next WordPress or Linux. (Image by Sven Balnojan, based on the photo by Markus Spiske on Unsplash) “Open Source projects exhibit natural increasing returns to scale. That’s because most developers are interested in using and participating in the largest projects, and the projects with the most developers are more likely to quickly fix bugs, add features and work reliably across the largest number of platforms. So, tracking the projects with the highest developer velocity can help illuminate promising areas in which to get […]

COSS: 7 Models to Develop & Price Open-Core Products

How a commercial open-source software company should develop & price open-core products to fight the hyper-cloud “service-wrappers”. (Photo by Tim Mossholder on Unsplash; Are you still open for contributions?) dbt Labs, formerly Fishtown Analytics, recently raised a large Series C. In the announcement blog post, Tristan Handy, CEO of dbt Labs, outlined the major risk he currently sees for open-core products like the one dbt Labs sells: commoditization by the hyper-clouds. Turns out, Sid Sijbrandij, CEO of GitLab, also a company selling an open-core product, thinks very much alike. He has a thorough analysis of how an open-core product can […]

How Conway’s Corollary Wrecks Your Data Organization

Conway’s law has an evil corollary that goes unnoticed in the dev world but wrecks your data org. “You think it’s a hack, but all you’re hacking apart is the value of your data.” (the author) Image by Sven Balnojan. Melvin Conway, a brilliant computer scientist who also invented the notion of a coroutine, has become pretty famous in the last 20 years for a law named after him: “Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.” Turns out this is very important as we move towards […]

Data as Code — Achieving Zero Production Defects for Analytics Datasets

How to apply the true Data as Code philosophy to achieve close to zero production defects using the tried & true methods from software development on data. Yep, zero defects! That’d be awesome. Image by the author. Data teams spend close to 60% of their time on operational things, not producing value. They also experience a large number of bugs in their data systems, according to the DataKitchen study & Gartner survey. Yet, in the software development world, we already have the philosophies in place that allow high-performing teams to deliver both quickly and at a high level of quality, […]

4 Excellent Examples of Agile “Nearsighted” Roadmaps

How GitHub, Meltano, Airbyte, and Atlassian manage to stay focused on bigger goals while still staying flexible and agile. “As a PM, you must plan for the near term milestones (more detailed) as well as for the long term strategy (more broad), and everything in between. Considered as a spectrum, these form a nearsighted roadmap. This will enable you to efficiently communicate both internally and externally how the team is planning to deliver on the product vision.” (from the GitLab Product Handbook) “In preparing for battle I have always found that plans are useless, but planning is indispensable.” (Dwight D. Eisenhower) […]

How To Estimate The Value of Data Products

The business value of data products is often miscalculated. Learn these two rules to calculate it correctly. WSJF and the problematic value for data products. Image by the author. For me, a product manager, weighted shortest job first (WSJF), or what is called the cost of delay, changed my perspective on understanding value. That’s what we want to do as product managers: maximize value. And the essential ingredient in that formula is the business value of a task or job. For data products, for data-heavy products, machine learning solutions, business intelligence systems, in short everything that has data at its core, I […]

Trunk Based Development For Data & Analytics Engineers

How to avoid merge hell, speed up delivery of business value, reduce defects, and live happily ever after in your data warehouse. Faster development, fewer defects on deployment to production with trunk-based development in data workflows. Image by the author. “We needed an extra day to merge the transformation branches together”, “Ah yeah but there was a bug once we finally got the data to production, so we had to redo some stuff for another 2 days”,… sound familiar? To me, it seems like data and analytics engineers are particularly prone to running into the “merge hell” or […]

The Ultimate Machine Learning Product Checklist

Use 7 simple questions to find machine learning opportunities, even without any technical knowledge (Photo by Markus Spiske, Unsplash) Machine learning, AI, and data science all carry lots of scary and complicated concepts like deep neural networks, cross-entropy, optimization… Enough scary words to keep all but the most tech-savvy product managers from even thinking about integrating machine learning into their products. But that, in turn, makes it hard for a company to get all the value out of its machine learning engineers if most product managers shy away from employing them. I like to use a dead simple […]

There’s More Than One Kind of Data Mesh — Three Types of Data Meshes

On Redshifts, Data Catalogs, Query Engines like Presto, and the trouble machine learning engineers have getting their data. Image by the author. The author, confused between lots of different data mesh architectures. Data Meshes are the hot & trending topic in data & analytics departments. They have been implemented at big companies like Zalando and moved from the “Assess” to the “Trial” status of the ThoughtWorks Technology Radar within just one year. Yet the results I’ve seen are not overly impressive. Quite a few articles raising concerns have appeared throughout the past year, and at least I have gotten quite a bit […]

Three Surprising Books Every Data Guy Should Read…; ThDPTh #2

Refactoring, Working Effectively with Legacy Code, and Test-Driven Development for Data Guys. …on software engineering. Hi, I’m Sven. I think data will power every piece of our existence in the near future. I collect “Data Points” to help understand this near future. If you want to support this, please share it on Twitter, LinkedIn, or Facebook. Here are your weekly three data points: Refactoring, Working Effectively with Legacy Code, and Test-Driven Development. Why three software engineering books for data guys? Because I believe every data team should be treated as an agile development team. 1 Refactoring by Martin Fowler Whenever I take […]
