The Data AI Can't Copy Just Raised $40 Million

In late May 2026, CVS, the U.S. healthcare company, invested $40 million in a data startup called H1. The news landed at a moment when SaaS startup funding had noticeably cooled in the wake of the AI boom, and it drew attention both inside and outside the industry. But more striking than the size of the deal was a single line H1 CEO Ariel Katz offered as he explained it.

"AI can replicate workflow SaaS.
But it cannot replicate the proprietary physician data H1 holds."

That sentence compresses, in just a few words, a shift now playing out across the software industry. In an environment where AI tools are spreading fast, the line between what holds its value and what doesn't is coming into sharper focus. And by spending $40 million, CVS confirmed that the line sits in a surprisingly simple place.

What AI Catches Up To Fast—and What It Doesn't

H1 is a data company that compiles information on U.S. medical professionals and supplies it to pharmaceutical firms, biotechs, and medical-device companies. It delivers, in refined form, each individual physician's specialty, prescribing patterns, academic publication record, conference-presentation history, and standing within a hospital's structure of influence. It offers sales-automation and marketing features too, but the asset the company actually leads with isn't those features—it's the data itself.

There's a reason this distinction has come to matter lately. As general-purpose AI models like GPT-4o and Claude have taken hold, the forecast that "AI will soon replace most workflow software" has been edging toward reality. Tasks like scheduling, drafting emails, summarizing contracts, and tidying up meeting notes are already being handled by AI in large part. The so-called tool layer is being commoditized fast, and the time it takes to build any single workflow feature has shrunk to a degree that's hard to compare with the past.

H1's physician database, by contrast, sits on a different plane from this trend. Information like the prescribing histories of doctors across the United States, their academic networks, and their decision-making roles inside hospitals can only be assembled by collecting individual records over years and verifying their consistency. It can't be gathered overnight, and it comes with legal constraints. Its credibility also depends on time: the shorter the accumulation period, the lower the adoption rate in the field. When CVS invested $40 million, it was betting on that accumulation process itself.

Why the Moat Is Shifting from Features to Data

Before AI, a software product's competitiveness was often explained in terms of feature advantage: a more intuitive UI, more integrations, faster processing. That logic began to wobble once connecting a general-purpose AI through a single API made it possible to catch up on much of a feature set within weeks. Startups and solo developers alike, drawing on open-source models, found that feature work that would once have taken years now took weeks.

In business strategy, the key yardstick for explaining "sustainable competitive advantage" is the cost of imitation: how much time and how many resources it would actually take a competitor to copy your strength. The higher that cost, the thicker the moat; the lower it is, the faster the position erodes. In H1's case, the thickness of the moat is the accumulation period of the physician data itself. A feature advantage can be matched by a competitor within six months. Three years of clinical and prescribing data takes three years to accumulate.

You can see this structure in Korean cases as well. One reason a real-estate information platform earns more trust than the big portals for certain uses isn't a difference in features but the density of actual-transaction data that users entered and verified themselves over years. A fashion e-commerce player has held its ground against the large platforms because fashion-specific reviews and sizing data piled up through community activity over a long stretch of time. Even when the features look similar, a difference in data density changes how the product feels to use. I see this not as a mere story about a tech trend but as a fundamental design question for running a business.

Even So, "Have the Data and You'll Survive" Is an Overstatement

Here we have to face the counterargument honestly. H1's logic is persuasive, but stretch it into a general formula—"own proprietary data and you'll keep your competitive edge even in the AI era"—and there are points to be careful about.

First, data without the capability to use it just sits idle. More than a few large Korean companies hold years of customer data yet have failed to use it to build new services or improve existing decisions. Accumulating data and refining it into something genuinely valuable are entirely different kinds of capability.

Second, the data-moat strategy isn't a path immediately open to a solo founder just starting out. It's hard for a small team to assemble, in a short time, anything resembling the physician database H1 built over many years. This is a defensive logic that works for someone who has already accumulated data—not necessarily an offensive strategy that applies right away to someone just entering.

Third, as privacy regulation tightens, the cost and legal risk of building and maintaining proprietary data rise along with it. There's no guarantee that a model H1 built inside the particular regulatory structure of the U.S. healthcare market works the same way in another country or another industry.

Even so, the issue is hard to set aside. As AI rapidly erodes the tool layer, a business whose value lies only in features and workflow stands on ground that keeps narrowing.

The Question Left for the Solo Operator

When we carry H1's case over to the context of a Korean solo founder or small-scale practitioner, there are a few categories worth examining.

Relationship data. Information about a particular client contact's decision-making style, the report format they prefer, and who actually holds influence within an industry. If it lives only in your head or sits fragmented across a notes app, it can't be reused on the next project. Recorded systematically, it becomes a starting point a competitor in the same industry doesn't have.

A collection of domain-specific cases. The patterns you've observed repeatedly within a specific industry, region, or customer group. Information like "the common features of the profit-and-loss structure in the first three months after a mid-sized independent café opens" or "in B2B sales, companies whose approval chain runs more than three levels deep take an average of six extra weeks on early decisions" isn't in a general-purpose AI's training data. As this accumulates, the very nature of your evidence—whether for consulting or for content—changes.

The verbatim text of customer feedback. Not the average score on a survey, but the sentences customers wrote themselves. As lines like "at first I got confused here, and this was why" pile up, you gain a different basis for deciding how to improve a feature and which direction to take your marketing. Averages make you lose your bearings; the original text helps you find them.

The premise behind all of this is recording. However good the relationships and experience, if they aren't recorded, they can't be reused the moment your team grows or you connect an AI tool. That's why the idea of redesigning business operations as a process of accumulating data carries weight in this context.
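To make "recording" concrete, here is a minimal sketch in Python of what a systematic record could look like for the three categories above. Everything in it is an assumption for illustration, not something H1 or any company mentioned here uses: the `Record` fields, the three `kind` labels, and the choice of an append-only JSON Lines file are all hypothetical. The point is only that each entry keeps the original wording, a date, and a category, so it can be filtered and reused later.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import date
from pathlib import Path
from typing import List, Optional

# Hypothetical category labels, mirroring the three categories in the text.
KINDS = {"relationship", "case", "feedback"}


@dataclass
class Record:
    kind: str      # one of KINDS
    subject: str   # who or what the note is about (hypothetical field)
    text: str      # the observation or the customer's own words, verbatim
    recorded_on: str = field(default_factory=lambda: date.today().isoformat())

    def __post_init__(self) -> None:
        # Reject unknown categories so the log stays filterable.
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")


def append_record(path: Path, record: Record) -> None:
    """Append one record as a single JSON line; earlier lines are never rewritten."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record), ensure_ascii=False) + "\n")


def load_records(path: Path, kind: Optional[str] = None) -> List[Record]:
    """Read all records back, optionally keeping only one category."""
    if not path.exists():
        return []
    records = []
    with path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                records.append(Record(**json.loads(line)))
    return [r for r in records if kind is None or r.kind == kind]
```

In use, a line of customer feedback would be stored with `append_record(Path("records.jsonl"), Record("feedback", "onboarding", "at first I got confused here, and this was why"))`, and `load_records(Path("records.jsonl"), kind="feedback")` would return every such sentence in the order it was written. A plain text file is a deliberate choice in this sketch: it stays readable without any tool, and the accumulated lines can later be handed to a teammate or an AI tool as they are.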

The logic H1's CEO offered to explain the CVS investment is simple: the moat is not software features but accumulated data. In the business you run today, what could a competitor not quickly match, even if willing to put in the time? For all you invest in features and tools, are you investing just as much in records and data?