This was originally a “Thoughtful Friday” reserved for newsletter subscribers. If you like the content below, I suggest you subscribe as well! In my head, the three biggest challenges for the world of data are very clear. They are so clear because I believe in two fundamental ideas about our world and its relation to data: 1) Every company will turn into a software company, which will in turn become a data company, with data at the heart of its business strategy. 2) Data will continue to grow exponentially. Mainly because I simply believe them, but of […]
… you into three fallacies: the need to build any platform at all, the need to build a data mesh, and lots of coupling inside the platform. (based on Photo by Thomas Tucker on Unsplash) “The most valuable resource is no longer oil, but data,” to put it in the words of The Economist. Extracting that value from data is the new frontier for companies of this century. Data meshes appeared in 2019 to fundamentally change how organizations approach data. In my words, data meshes are pretty simple (but not easy). “An organization has a data mesh, if it has […]
14 thoughts on the economics of the open-source data space and how to become the next MongoDB or Databricks. Image by the author. The data space is booming, with companies like MongoDB (valued at 18 billion USD), Databricks (30 billion), Confluent, and many others. The startup space is overflowing with money, and lots of founders want a share of the pie. But in my opinion, the data space is set up to be dominated by open-source solutions in the near future. Open-source spaces have a very clear winner-takes-most dynamic, which makes them extremely hard to compete in. And […]
No, DaC is not just versioning data! It’s applying the whole software engineering toolchain to data. For that, we need principles. This post is part of a small series beginning with: Data as Code — Achieving Zero Production Defects for Analytics Datasets. Image by Sven Balnojan. Data as Code is a simple concept. Just like Infrastructure as Code. It just says “Treat your data as code”. And yet, after IaC appeared on the ThoughtWorks Radar in 2011, it still took roughly 10 years to “settle in” and is still on an uneasy spot where IaC advocates feel they need to remind people […]
In 2006, LinkedIn launched a new feature called “People You May Know”. This “prompt” turned out to have a 30% higher click-through rate than any other prompt in use at LinkedIn. It created millions of additional views and connections. The team went on to create a number of additional machine learning products and helped foster a deeply data-driven culture at LinkedIn. Along the way, LinkedIn uncovered two important principles. The first principle is well explained by DJ Patil: “After all, what is a social network if not a huge dataset of users with connections […]
Dashboards, Graphs, Reports, Spreadsheets, OLAP Cubes, or direct SQL Access?
Data meshes are the hot, trending topic in data & analytics departments. They have been implemented at big companies like Zalando and moved from the “Assess” to the “Trial” status of the ThoughtWorks Technology Radar within just one year. Yet the results I’ve seen are not overly impressive. Quite a few articles raising concerns have appeared throughout the past year, and I, at least, have received quite a few questions and seen a fair bit of confusion about the topic since publishing my first article about data meshes.