How a commercial open-source software company should develop & price open-core products to fight the hyper-cloud “service-wrappers”.
dbt Labs, formerly Fishtown Analytics, recently raised a large Series C. In the announcement blog post, Tristan Handy, CEO of dbt Labs, outlined the major risk he currently sees for open-core products like the one dbt Labs sells: commoditization by the hyper-clouds.
It turns out that Sid Sijbrandij, CEO of GitLab, another company selling an open-core product, thinks along very similar lines. He has a thorough analysis of how an open-core product can be developed & priced.
This article is a deep dive into this issue and explores Sid’s thoughts with real-world examples. I believe it’s essential to uncover the assumptions hidden in Sid’s approach, as well as the alternatives, to understand which version of it you can apply to your own unique business. So let’s see what dbt Labs and the other open-core companies are currently trying to do!
What is Open-Core?
Most commercial open-source companies iterate through a couple of business models before arriving at open-core. I enjoyed TimescaleDB’s short analysis of the five most common models:
- Offer paid support for a completely open-source product
- Offer paid hosting for a completely open-source product
- Change the license so the product is “not so open” anymore, but paid
- Have one completely open-source open-core, plus another set of features that are paid (typically in a hosted solution)
- Have a hybrid license where the product can be used without the license, but the extra features are unlocked only with it (technically that’s really open-core just not with a hosted solution — the key here is to have everything in an open repository)
Both GitLab and Starburst started out offering paid support and then slowly iterated through business models until they arrived at open-core. As you can see, this apparently very popular business model has at least 5 different flavors. If you want to know the potential downsides of models 1–3 & 5 in the list above, I suggest you listen to either Sid or Justin Borgman.
The Single Biggest Benefit of Open-Core
Open-source companies are able to use all the development power of the crowd, the fast feedback, and the marketing & sales that come from their open-source project. To benefit from that, the project needs to be successful, and as explained in my previous article, this means it basically has to be really successful: number one or two in its space. And given that 98% of open-source projects fail, that’s even harder than it sounds!
So, you want to open up as much as possible and make setup & development as easy as possible. If you choose one of the other business models, the incentives of your organization will be misaligned with that goal:
- paid support: you’re incentivized to make the docs hard to read because that’s what people are paying for…
- paid hosting: you’re incentivized to make the setup really hard, or at least not easy, not the “5-minute installation”
- tweaked license: you’re incentivized to develop on the licensed side only!
This is also exactly what both Sid & Justin explain when you listen to their journeys. It sounds like a setup for the failure of an open-source project, although of course there are a lot of big exceptions, most of which switched to models 1–3 later on.
So what do you do? You simply put everything into the open-core that you’re not planning to monetize. That leaves the key question: which features do customers pay for?
The Potential Problem with Open-Core
The hyper-clouds, AWS, Google Cloud, and Azure, have lately become known for offering “wrapped services”, that is, a paid hosted variant of an open-source project. This means the service might become a commodity and competition shifts to cost. That’s pretty bad, because purely price-based competition has its equilibrium, its endpoint, at zero margins.
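To make the “endpoint at zero margins” claim concrete, here is a minimal sketch of two identical sellers undercutting each other. All numbers (starting price, marginal cost, step size) are made-up assumptions for illustration, and real markets are of course messier than this toy loop.

```python
def undercut_until_floor(start_price_cents, marginal_cost_cents, step_cents=100):
    """Toy model: sellers of an identical hosted service take turns
    undercutting each other by step_cents. Nobody sells below marginal
    cost, so the undercutting stops once the next cut would go below it."""
    price = start_price_cents
    history = [price]
    while price - step_cents >= marginal_cost_cents:
        price -= step_cents
        history.append(price)
    return history


# Hypothetical numbers: $20.00 starting price, $5.00 marginal cost.
MARGINAL_COST = 500
history = undercut_until_floor(2000, MARGINAL_COST)
final_margin = history[-1] - MARGINAL_COST
print(f"final price: {history[-1]} cents, margin: {final_margin} cents")
```

As long as the only thing that differs between the offers is price, the loop ends at marginal cost. The rest of this article is about making sure price is not the only difference.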
Is service-wrapping truly a problem?
First of all, everyone can offer “wrapped services” and I’m pretty surprised so few companies have done so in the past. I don’t see a good reason for that, so I expect more companies to offer “wrapped services” in the future, not just hyper-clouds. In fact, I’d love someone to offer a “5-minute setup for a ready to use BI chain including ingestion, data warehouse, data discovery tool, dbt & BI tool”.
The hyper-clouds deserve a bit more attention because they operate at near-zero costs, and a COSS (commercial open-source software) company is not able to compete at that level purely on cost.
But no one said they have to.
Embrace The Genius of the And
From the outside, and I’m definitely on the outside, it might appear that Elastic, MongoDB, Confluent, and some other companies feel like they are under the “Tyranny of the OR”. Elastic thought they had only two options:
- keep Elasticsearch completely open-source, get commoditized and, in the end, take a serious hit to the business.
- not keep it open-source, to sustain the business.
That belief is what Jim Collins calls the “Tyranny of the OR”. It might also be partly behind the decision of dbt Labs to not open-source parts of the GUI and the CI integrations they have for dbt. But as Jim Collins points out, the world is never either/or. If you look closely, you can embrace the “Genius of the AND”:
“George Merck II explained this paradox in 1950: I want to…. express the principles which we in our company have endeavored to live up to…. Here is how it sums up: We try to remember that medicine is for the patient. We try never to forget that medicine is for the people. It is not for the profits. The profits follow, and if we have remembered that, they have never failed to appear. The better we have remembered it, the larger they have been.”
I am certain that if you think in a principled way and keep on looking, you can find a way to keep your open-source project completely open-source while still growing your business like crazy. In fact, I think you can align these two things. GitLab has found its answer to making this possible, but I am certain there’s an infinite pool of options.
Why This Might Not Be a Problem
Elastic, MongoDB, Confluent: all of these COSS companies went through some serious battles with the hyper-clouds. Yet it seems they have largely been untouched by the results, although I wouldn’t consider any of them innocent. In fact, there’ve been some serious blows going in both directions.
My personal experience with hyper-cloud “wrapped services” is that they mostly suck for most companies with higher demands. They seem targeted at two customer segments:
- The ones aiming at deep integration with the specific cloud. This is a big “no-no” in my mind, in fact, you usually want to aim to reduce both the component level coupling as well as the cloud-specific coupling.
- Companies who just want to get things to run.
Of course, this is very dependent on the product. MongoDB, Elasticsearch, Kafka, and Apache Airflow all have hosted versions on AWS, and these are very likely used by very different customer groups.
dbt Labs’ Response
Sid Sijbrandij’s response, and the one I share, is pretty simple: develop your product’s paid features orthogonal to what the hyper-clouds target or even could target.
Example dbt Labs: I don’t know the exact customer base of the roughly 2,000–3,000 paying dbt customers. If we assume for a second that they are small- to medium-sized companies, and mostly smaller units inside an organization rather than the whole organization, then it seems like the company has a problem. Because this customer group is likely to be price sensitive and actually cares about getting things to run fast, there are grounds for competition on cost & getting up and running fast.
An AWS “dbt service wrapper” could take away a good portion of this customer base.
But dbt Labs has something to offer that AWS cannot even develop. They have the world’s best analytics engineers; they are the no. 1 source I would go to if I were to transition to an analytics engineering setup. And because of their background, they already seem to have that rolling. Exactly as with GitLab in the enterprise version, you get (or will get) a “Solutions Architect” who helps you get set up with dbt inside your company and turn on the integrations to your current systems.
So dbt Labs is going in exactly that direction, orthogonal to what the hyper-clouds can offer: highly individualized setups and a focus on the prime credibility they already built through their open-source project.
That also means, following my line of thinking, dbt Labs could put a lot more features into their open-core than they previously did to maximize the power of their open-source project.
Let’s take a look at some alternatives. Keep in mind, it’s not just about what you price, but really about how you develop your product.
Cycle Stage Model
The cycle stage model means you slice your product’s features according to some cycle. For GitLab it’s the software development cycle, for dbt it’s the data model development cycle. You slice & then price accordingly.
Example: dbt Labs actually does a bit of “Cycle Stage Pricing”: they chose to exclude a whole suite of “cycles” from the open-core, namely the integrations to CI & CD systems. To be fair, this is available in the free one-person version, but still, they chose to exclude this cycle from the open-core.
But excluding this from the open-core also means severely limiting the community contributions, feedback, and a good portion of marketing on whole cycles. This is of course problematic because companies usually need the whole cycle, not just one stage.
Currently, dbt Labs can afford that, because they are still one of the only SQL transformation-only tools out there, but in the future, open-source solutions might appear that don’t limit the scope of the open-core to just one stage.
It’s the reason GitLab has features across the whole Software Development Lifecycle in their open-core, not just one or two stages.
Company Size Model
Company-size pricing means bigger companies should pay more. To make them pay more, companies usually don’t just increase the number of “users” but also try to select features that big companies are most likely to use.
Example: The COSS company Prefect chooses to incorporate a bit of company-size-based pricing in their model. As far as I understand it, they push that down into the open-core, which they don’t have to do. In the open-core you don’t get multiple users; you need to purchase a paid plan for that.
Where is the problem? I think the problem very much parallels what GitLab experienced. Small companies might need quite a few of even the enterprise features. “Run histories” and “Audit trails”, which sound more like “access to the logs”, will very likely be important to most small companies. So there’s a mismatch between the pricing model and the actual customer needs across both the paid and the open-core version.
This means they actually limit the profits they could draw from their paid version and, because they push this down into the open-core, also limit the community involvement and thus future growth.
In general, a company-size model is also more prone to commoditization because it usually aims at cost.
Maturity Model
The maturity model means, again, that you slice your features across some dimension, like the maturity of a company in e.g. DevSecOps, or Data Democratization, or anything you can think of.
Example: Databricks’ pricing model seems to realize that the buyer is usually not a manager but mostly either a director or an executive, so higher up. It’s probably why they simply scale governance & security up across their three tiers and change next to nothing else. But the SQL computation doesn’t fit well inside that model. Excluding something like running SQL queries on top of a data lake seems to be aimed mostly at the maturity of a company along the data democracy dimension.
This is also a good example of the problems that come with it: Small companies sometimes need “all of the help they can get to get started”. They might be at the very beginning of setting up a data team, but they will very likely also need to run SQL queries. So developing this feature just in the higher-up versions means a mismatch between the buyer’s needs & the pricing model.
Again this cuts into adoption & profits.
Sid’s Buyer Side Model
Sid Sijbrandij has a very thoughtful perspective on pricing. It’s something you have to think about twice, but once you get it you’ll love it:
“Price for what the buyer wants, nothing more. Put everything else in the open-core”
If you check out GitLab’s prices now, they seem to have realized that next to the individual developer, who sets things up and tests them out at a price of $0, it’s just two buyers driving the adoption: the manager who then buys it for the team (at $19) and the executive who rolls it out for the whole company (at $99).
If you look at the features, you’ll notice that it’s just about what the buyer cares about, nothing more!
There’s a genius in this: you still get the buyer to buy, but you don’t limit the adoption. As explained above, Prefect actually puts more features behind the paywall than a buyer would care about and thus limits the adoption without gaining additional profits.
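The rule “price for what the buyer wants, put everything else in the open-core” can be written down as a tiny sketch. The feature names and their mapping to buyers below are purely hypothetical assumptions for illustration; they are not GitLab’s actual tiers or feature list.

```python
# Hypothetical tiers, from free to most expensive.
TIERS = ["open-core", "manager", "executive"]

# Hypothetical features mapped to the buyer who cares about them.
# None means no buyer specifically asks for it, so it stays open.
FEATURES = {
    "ci_pipelines": None,
    "merge_requests": None,
    "team_dashboards": "manager",
    "approval_rules": "manager",
    "company_wide_audit_reports": "executive",
    "portfolio_planning": "executive",
}


def assign_tiers(features):
    """Put each feature into the tier of the buyer who wants it;
    everything no buyer asks for lands in the open-core."""
    plan = {tier: [] for tier in TIERS}
    for feature, buyer in features.items():
        tier = buyer if buyer is not None else "open-core"
        plan[tier].append(feature)
    return plan


plan = assign_tiers(FEATURES)
for tier in TIERS:
    print(tier, "->", plan[tier])
```

The point of the sketch is the default: a feature only moves out of the open-core when you can name the buyer who cares about it.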
This model might need a lot of tweaking for your company. There are two things that you should consider:
- The buyer in B2B actually is not a single person. The usual model employed in B2B sales is the “buying-center” model. Think hard about what your buyer personas look like.
- The buyer is not just “manager/executive/…”; you could also have, for example, a “health-care exec”.
Let’s take a look at two examples that showcase these two variations.
Databricks “Buyer Side” Model
I don’t have deep insight into Databricks’ pricing strategy, but they seem to have a very good understanding of their “buyer-side”. As far as I understand it, the Databricks product isn’t purchased by the manager. It’s purchased by either the director or the executive in a company. It’s a higher-level product, not like a GitLab or a dbt, which might also be bought by one single product/engineering manager.
In that spirit, Databricks offers three tiers: one which seems to be targeted at the director, and two which target different types of executives, the typical executive and the hyper-security-sensitive executive (e.g. in the healthcare industry). That’s a great example of a deep understanding of the actual decision-makers and a pricing model to match.
Buyer Side+ Model
But if we look at the typical buying-center model, we should realize that buyers sometimes are not one-dimensional. One second dimension I would love to see in a lot of data-focused companies is compliance needs such as GDPR.
Almost all data companies try to push this into their one-dimensional pricing model, but they shouldn’t.
Just as Sid argues, GDPR compliance needs are really a reflection of the buying center. There is a gatekeeper, and in a lot of European companies, the legal department is an important gatekeeper. Your pricing model should reflect that, and allow an “opt-in” option because this is not dependent on who makes the decision to buy, but rather whether there is such a gatekeeper or not.
This gatekeeper might be there in companies of any size, so in my opinion, a second dimension is needed, not an offer “just for the big ones — because they are the only ones who care about security”.
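A second dimension can be sketched as an opt-in add-on that is independent of the tier. The tier names, prices and the add-on price below are hypothetical assumptions for illustration only, not any vendor’s real price list.

```python
from dataclasses import dataclass

# Hypothetical per-user prices for the first dimension (who buys).
TIER_PRICES = {"free": 0, "team": 19, "company": 99}

# Hypothetical per-user price for the second dimension (is there a gatekeeper?).
COMPLIANCE_ADDON_PRICE = 10


@dataclass(frozen=True)
class Quote:
    tier: str
    compliance_addon: bool = False  # opt-in, available on every tier

    def price_per_user(self):
        price = TIER_PRICES[self.tier]
        if self.compliance_addon:
            price += COMPLIANCE_ADDON_PRICE
        return price


# A small European team with a legal gatekeeper, and a large company without one.
small_eu_team = Quote(tier="team", compliance_addon=True)
large_company = Quote(tier="company")
print(small_eu_team.price_per_user(), large_company.price_per_user())
```

Because the add-on is a separate field rather than part of the top tier, a small company with a gatekeeper is not forced into the most expensive plan just to get compliance features.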
“Circle The Wagons” Model
We’ve spent lots of time on pricing models and their effects on both commoditization and the open-source projects’ success.
I’d finally like to highlight one orthogonal move I particularly like. The company Automattic chooses to first focus on the high-end segment of hosting, just like dbt Labs and GitLab do.
But they also develop “circle the wagons” products: products that only work in the cloud and as such are a bad fit for the open-core. Akismet, the spam-fighting plugin, becomes better the more data it has, so it’s simply not useful to deploy it on your own.
In the same vein, Jetpack is only suitable for the cloud, and Automattic will likely find many more products in this category. Have you thought about your “circle the wagons” products?
That’s a lot of pricing models! And yet, as outlined above, it’s really essential to understand what to put into the paid version of an open-core product and what not to. So it pays, quite literally, to deeply understand the model one wants to pursue, the buyers, the buying center, and finally the real needs of the buyers.
Hope it helps!