
This was originally a “Thoughtful Friday” reserved for newsletter subscribers. If you like the content below, I suggest you subscribe as well!

In my head, the three biggest challenges for the world of data are very clear. They are so clear because I believe in two fundamental ideas about our world and its relation to data:

  • 1) Every company will turn into a software company, which in turn will become a data company, with data at the heart of its business strategy.
  • 2) Data will continue to grow exponentially.


I hold these ideas mainly because I simply believe them, but also because I’ve seen glimpses of them, and there is both quantitative and qualitative evidence that this is the future we’re heading towards.


If you put one and one together, that means data will power everything. And I mean everything. If you take a look at the numbers on the exponential growth, this future will be here “soon”. Compared to today, the world in 30 years will be dominated by data.

Today, we’re in the stone age of data…

Technology

I’m a technology lover, a fan of complex things, and a deep diver into a lot of data-related tech. I’ve been blown away by machine learning models that can write text, and by the fact that we have both machines that can (with a lot of caveats) build better machine learning models than humans, and machines that can surpass humans at the most human task there is: pattern recognition.

Software Engineering

At the same time, I’ve had the good fortune to study and observe the software engineering world, which, even though it’s a young discipline, has made some major leaps in the past 20 years. Microservices, DevOps, and infrastructure as code have turned software engineering into a discipline that delivers value with laser-sharp focus.


It does that by carrying over a lot of best practices from the manufacturing & engineering world, where they have long been established and have increased both quality and speed at the same time.

The Missing Pieces

Moving from company -> software company doesn’t seem to be the issue. A lot of companies are making that move. I think that’s because we are already getting pretty good at extracting business value from software.


But for three reasons, I really feel like we’re only at 1% of where we could be when it comes to extracting business value from data.

1. I Don’t Like To Drive

I don’t like to drive myself. We all know that the “senses” cars have are already 10x better than mine, because they simply sense a lot more. They can see all around me at all times, much farther than I can, and so on.


They can also plan 2-3 steps ahead, whereas I usually can’t, at least not while driving.


They are able to sense better than I am because sensing is about pattern recognition, and machines are already surpassing humans on pattern recognition tasks.


If you take all of these individual technology pieces together, I simply shouldn’t have to drive myself anymore. And yet, we are still here, slowly learning to walk…


In German, there is a distinction between the two words “Technik” and “Technologie” which I always enjoyed, where “Technologie” is basically what I describe above, the realm of possibility, and “Technik” is what actually is usable, out there.

We’re simply moving very slowly from technology to actual usable products, and that confuses me not just in the area of autonomous driving, but everywhere.

2. My Lawnmower

My robotic lawnmower is a much better mower than I am. He’s called “BB-8”, by the way. I think he’s better for a bunch of reasons, all of which come down to him doing things differently than I do:


1) BB-8 cuts only 1-2 mm off, thereby mulching the grass.
2) BB-8 mows almost constantly.
3) He does a basic random walk, thereby mowing my grass very evenly, but also reaching the extremes I sometimes ignore.
4) His target is statistical: to mow the whole lawn, on average, every couple of days. Mine is not; it’s discrete: mow everything.
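
To make BB-8’s statistical target a bit more tangible, here is a minimal Python sketch of a random-walk mower. Everything in it is an assumption for illustration: the lawn is modeled as a grid of cells, the mower moves one cell per step, and the name `mow_coverage` and all the sizes are made up. A real mower does not move like this, but the idea is the same: nobody plans a route, and coverage simply builds up over time.

```python
import random


def mow_coverage(width, height, steps, seed=0):
    """Return the fraction of lawn cells visited by a simple random walk.

    Hypothetical model: the lawn is a width x height grid and the mower
    starts in the middle, moving one cell up, down, left, or right per step.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    x, y = width // 2, height // 2
    visited = {(x, y)}
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    for _ in range(steps):
        dx, dy = rng.choice(moves)
        # Clamp to the lawn boundary, like a mower stopping at its wire.
        x = min(max(x + dx, 0), width - 1)
        y = min(max(y + dy, 0), height - 1)
        visited.add((x, y))

    return len(visited) / (width * height)


if __name__ == "__main__":
    # Coverage after increasingly long mowing sessions on a 20 x 20 lawn.
    for steps in (100, 1_000, 10_000, 100_000):
        print(steps, round(mow_coverage(20, 20, steps), 3))
```

With a fixed seed, a longer run starts with exactly the same path as a shorter one, so coverage can only go up as the step count grows. That is the “on average, everything gets mowed” goal, as opposed to my discrete “mow everything now” approach.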


If I now think about an automatic coder, I feel like he would bring a lot of the same benefits. An automatic coder would probably:
1) Do a lot of random walks, thereby uncovering patterns (as in coding patterns) no human has ever encountered.
2) Work constantly, thereby continuously improving software systems, whereas current development cycles are tied to humans who actually sit in front of a computer.
3) Work in super small increments and ship them constantly, thereby getting a lot more feedback, more than a human could ever process.
4) Be statistically focused as well! Humans always strive for correct code, for APIs that deliver the right response 100% of the time; yet humans are terrible at perfection. A statistical approach would almost always suffice: getting the right response 99.9% of the time, with a small deviation in the remaining cases, is more than enough, and usually 10 times easier to build.


In short, I feel like an automatic coder would be a lot better at coding than human software engineers are. That doesn’t mean the software engineer is no longer needed. Just like me with the lawnmower, they are then free to do the more fun stuff, like setting the parameters and focusing the automatic system on its goal, which is the part that actually creates the most value.
But for some reason, besides GitHub’s “Copilot”, next to no actual products currently explore this space. And yet it is necessary in order to scale software engineering to every company out there.

3. The “Data” Guy

The final reason I feel we’re just 1% there is simple: We still have a “data” person, and most “data” people don’t feel like they are doing software engineering. That’s ok, I felt the same way starting out, and even though I did dive deep into DevOps and software engineering-related topics, I still had a hard time really feeling like I was actually doing software engineering.


But if, in the future, everything is about data, then pretty much everyone will simply be an engineer or a developer.


The problem is just that the mentality, the training, and even the tooling landscape are all still in a very different place.


The challenges are there, and the business opportunities are there. The exponential growth of data, and with it the exponentially increasing value for every company and every individual, is there. And yet, these three major challenges are not really being addressed by the data companies of this world. Not by a single one of them.
