Nvidia's New Strategy for Owning the AI Ecosystem

Two trends are unfolding in the AI market at the same time — and they point in such different directions that it's hard to believe they're describing the same industry.

On one side, Nvidia has poured $40 billion into AI equity stakes this year alone — roughly 60 trillion won. A company that sells GPUs is now buying stakes directly in the companies that buy those GPUs. On the other side, a 35-billion-parameter (35B) model is running at 80 tokens per second on an ordinary 12GB gaming graphics card. What required data-center-grade hardware just a year ago is now happening in real time on a gamer's desktop PC.

Big Tech is locking down the top with capital, while open source is boring in from the bottom with efficiency. Everyone else is caught in between.

Nvidia's Transformation: From GPU Maker to AI Capitalist

$40 billion is more than double South Korea's entire annual government R&D budget. Nvidia used that money to buy direct equity stakes in AI startups and infrastructure companies.

Why does this matter? Until now, Nvidia's business model was simple: make GPUs, sell GPUs. That was it. Now Nvidia holds equity in the companies buying its GPUs — and funds them so they can buy even more.

This isn't ordinary investing. It's a closed loop where Nvidia's money flows back to Nvidia.

Nvidia invests → Startup grows → Startup buys GPUs → Revenue rises → Nvidia invests more

Figure: Nvidia's Self-Reinforcing Investment Loop

This structure produces two effects. In the short term, it accelerates Nvidia's revenue and market value. In the long term, it locks the entire AI infrastructure ecosystem into Nvidia's standards. Even if AMD or Intel try to catch up later, breaking into an ecosystem already woven together by Nvidia's capital is a tall order.

The key shift: a GPU maker has recast itself as the capitalist behind the entire AI ecosystem — a strategy to lock down the market's infrastructure layer through sheer financial firepower.

Open Source Strikes Back: A 35B Model Running on a 12GB GPU

At the very same moment, the exact opposite is happening.

DeepSeek V4 released its FP4 quantization technique, which relies on quantization-aware training (QAT). Quantization lowers the numerical precision an AI model uses, shrinking its memory footprint, and a smaller model can run on a smaller GPU.
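To make the memory effect concrete, here is a minimal back-of-the-envelope sketch. It counts only the bytes needed to store the weights and ignores everything else a running model needs (KV cache, activations, runtime overhead). The function name and the choice of 1 GB = 10^9 bytes are assumptions made for illustration, not anything taken from DeepSeek's release.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


PARAMS_35B = 35e9  # a 35-billion-parameter model

# Compare common precisions: 16-bit, 8-bit, and 4-bit weights.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(PARAMS_35B, bits):.1f} GB")
```

Going from 16-bit to 4-bit weights cuts the weight footprint to a quarter (70 GB down to 17.5 GB for 35B parameters). Weights alone at 4 bits still exceed 12 GB, so a setup that fits on a 12GB card presumably relies on something more, such as keeping part of the model in system RAM; that is an assumption, not something stated here.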

Qwen3.6 uses MTP (Multi-Token Prediction) inference to run faster on the same GPU. Instead of generating one token at a time, it predicts several tokens simultaneously.
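Why predicting several tokens per step helps can be seen with an idealized count of sequential model passes. This is a simplified sketch, not a description of how Qwen3.6 actually implements MTP: it assumes every predicted token is kept, whereas real systems typically verify predictions and may discard some. The figure of 3 tokens per pass is an arbitrary illustration.

```python
import math


def forward_passes(num_tokens: int, tokens_per_pass: int) -> int:
    """Sequential passes needed if each pass yields `tokens_per_pass` tokens.

    Idealized: assumes every predicted token is accepted.
    """
    return math.ceil(num_tokens / tokens_per_pass)


ANSWER_LENGTH = 400  # hypothetical answer length in tokens

one_at_a_time = forward_passes(ANSWER_LENGTH, 1)
multi_token = forward_passes(ANSWER_LENGTH, 3)
print(f"1 token per pass:  {one_at_a_time} passes")
print(f"3 tokens per pass: {multi_token} passes")
```

Under these assumptions the same answer needs roughly a third as many sequential passes, which is where the speedup on unchanged hardware comes from.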

Combine the two techniques and the result is startling: 12GB of VRAM — about what a standard gaming RTX 4070 has — is enough to run a 35-billion-parameter model at 80 tokens per second. That's faster than a human can read.
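As a rough sanity check on "faster than a human can read", the sketch below converts a reading speed into tokens per second and compares it with 80 tokens per second. Both the reading speed (250 words per minute) and the tokens-per-word ratio (1.3) are assumed ballpark values, not figures from the benchmarks mentioned above.

```python
def reading_tokens_per_second(words_per_minute: float, tokens_per_word: float) -> float:
    """Convert a reading speed in words/minute into tokens/second."""
    return words_per_minute * tokens_per_word / 60


GENERATION_TPS = 80      # generation speed quoted in the article
WORDS_PER_MINUTE = 250   # assumed adult reading speed
TOKENS_PER_WORD = 1.3    # assumed average for English text

reader_tps = reading_tokens_per_second(WORDS_PER_MINUTE, TOKENS_PER_WORD)
speedup = GENERATION_TPS / reader_tps
print(f"Reader: {reader_tps:.1f} tokens/s, model: {GENERATION_TPS} tokens/s, "
      f"ratio: {speedup:.1f}x")
```

With these assumed values the model produces text roughly 15 times faster than a person reads it.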

Just last year, running a 35B model required a data-center GPU like the A100 — north of $15,000 apiece. Now the same job runs on a gaming graphics card that costs around 500,000 won (roughly $360).

The implication is clear: the pool of people who can run AI is about to explode. Even without the money to rent a data center or cover cloud costs, anyone will soon be able to run AI directly on their own PC.

What a 40% Crash in NAND Prices Is Signaling

One more intriguing data point has entered the picture. South Korea's spot price for NAND flash memory plunged 40% in a single month, the first sign of a reversal since the "AI memory supercycle" narrative took hold.

At its simplest, a falling NAND price means supply is outrunning demand — the opposite of the story that AI data centers are hoovering up all available memory.

There are two possible readings.

First, an early warning sign of AI infrastructure overinvestment. Memory makers may have ramped up production aggressively, assuming everyone was building AI data centers — but actual data center deployment may not be keeping pace. That's a decoupling between quoted prices and real demand.

Second, a sign that AI is evolving to need less memory. As efficiency techniques like FP4 quantization become standard, the memory required to run the same model can drop by more than half. For data center operators, that means doing the same work while buying less memory.

Either way, the simple story that "AI demand for memory will grow without limit" is starting to break down. The NAND price drop is the first visible crack in that story.

How to Survive Being Sandwiched

Here's how the current market structure breaks down.

The top: Nvidia, OpenAI, and Big Tech. They monopolize capital and infrastructure while building closed ecosystems. Every API call costs money, and that money flows up to Big Tech.

The bottom: open source plus efficiency technology. A 35B model runs on a 12GB GPU. No API calls, no cloud dependency — AI running on your own computer.

The middle: everyone else, squeezed in between. Relying on Big Tech APIs means ballooning costs; building your own infrastructure takes capital you don't have; running open source yourself takes technical talent you don't have.

The strategy for surviving this structure is to use both ends at once.

Strategy 1: Reserve Big Tech APIs for what actually matters. Don't throw every task at Claude or GPT; use them only where real reasoning power is required. Handle simple classification, summarization, and keyword extraction with open-source models instead. For those tasks, costs can drop to around a tenth.

Strategy 2: Take advantage of the falling barrier to running your own infrastructure. A year ago, building in-house AI infrastructure was the domain of large corporations. Now, gaming-PC-grade hardware can do meaningful work. Even small companies can run their own AI systems without ever sending sensitive data outside.

Strategy 3: Diversify your dependencies. Locking into a single Big Tech API leaves you exposed to price hikes, policy changes, and service outages alike. Use Claude, GPT, and Gemini for different kinds of tasks, and run some workloads on self-hosted open source. This is the "efficiency path" David George described as one of the "two roads of the AI era."
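Strategies 1 and 3 can be sketched together as a small router: simple tasks go to a self-hosted model first, harder tasks go to a hosted API, and every task has fallbacks if its first choice fails. Everything here is hypothetical and for illustration only. The backend functions are stubs rather than real client calls, the task categories come from Strategy 1, and the simulated outage simply demonstrates the fallback path.

```python
from typing import Callable, Dict, List, Tuple

Backend = Callable[[str], str]

# Task kinds treated as "simple" (taken from Strategy 1).
SIMPLE_TASKS = {"classification", "summarization", "keyword_extraction"}


def local_model(prompt: str) -> str:
    """Stub for a self-hosted open-source model."""
    return f"[local] {prompt}"


def hosted_api_a(prompt: str) -> str:
    """Stub for a hosted API that is currently down (simulated)."""
    raise ConnectionError("simulated outage")


def hosted_api_b(prompt: str) -> str:
    """Stub for a second, independent hosted API."""
    return f"[hosted-b] {prompt}"


# Ordered preference lists: cheapest first for simple work,
# strongest first for complex work, with fallbacks in both cases.
ROUTES: Dict[str, List[Tuple[str, Backend]]] = {
    "simple": [("local", local_model), ("hosted-b", hosted_api_b)],
    "complex": [
        ("hosted-a", hosted_api_a),
        ("hosted-b", hosted_api_b),
        ("local", local_model),
    ],
}


def run_task(kind: str, prompt: str) -> Tuple[str, str]:
    """Return (backend name, response), trying backends in order."""
    tier = "simple" if kind in SIMPLE_TASKS else "complex"
    errors = []
    for name, backend in ROUTES[tier]:
        try:
            return name, backend(prompt)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


print(run_task("summarization", "Summarize this memo"))
print(run_task("contract_review", "Find the risky clauses"))
```

The summarization request never leaves the local machine, while the complex request survives the outage of its first-choice provider by falling through to the second one. Keeping the preference lists in one table makes it cheap to reorder providers when prices or policies change.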

Why Robotics Is Moving First

Another notable announcement landed the same day. Nvidia unveiled its GR00T robotics model, while a separate study titled "LLMs corrupt documents" was published almost simultaneously. That these two events happened on the same day is telling.

The research on LLMs corrupting documents underscores, once again, the reliability limits of text-based AI agents. It means there's still a lot to verify before handing office work over to AI.

Robotics models, meanwhile, are advancing fast. It's a signal that AI could replace human labor faster in physical work, manufacturing, and logistics than in office jobs. Before the reliability problems of text are solved, the domain of physical action is already opening up first.

This trend has direct implications for Korea's manufacturing and logistics industries. Office automation may be arriving slowly, but physical automation could arrive faster than expected.

The Signals Worth Watching

Taken together, today's news points to three signals.

First, AI infrastructure investment has become a game of pure capital. $40 billion isn't a scale any individual company can match. Competition at the hardware layer should be considered effectively over. The real question is what to build on top of it.

Second, efficiency is the new differentiator. While Big Tech locks down the top with capital, open source is prying open the bottom with efficiency. For any company adopting AI, the ability to build a cost-efficient operating structure becomes a core competency.

Third, the AI memory supercycle won't last forever. The 40% drop in NAND prices may not be a one-off — it could be the first signal of a broader trend. The era of simply betting on AI infrastructure across the board is over. What's needed now is the judgment to tell which areas are genuinely growing and which are overbuilt.

Big Tech locking things down from above, open source boring in from below — that's the sandwich we're all in now, and it's time to decide where we stand within it. Climbing up to play the capital game isn't an option. That leaves two paths: carve out a position at the bottom using efficiency as your weapon, or cleverly combine both ends to fill the space in between.

Today's market is telling us that the window to find that path is closing.