The Model Labs Are All Agent Labs Now

As recently as early 2024, "agent" was the kind of word that only showed up on a startup's pitch deck. By the first half of 2025, the picture had changed. On the same day that Latent Space publicly declared that "every model lab is now an agent lab," the number-one project on GitHub Trending was multica, an open-source effort to run AI agents like actual members of a team. That same day, Naver and Kakao each announced through their own public channels that they had deployed multi-agent AI internally. In little more than a year, a concept that lived on pitch slides had made its way into the internal systems of Korea's largest companies.

If you file this away under "AI moves fast," you miss the point. What's happening now isn't a tool getting better at its job—it's a shift in the basic unit of how work gets done. For solo operators and small teams, that shift amounts to a reset of the competitive playing field.

What it really means when agents go live inside a company

"Agent lab" can sound like a slogan, so it's worth spelling out what it actually means. The job of a traditional AI model lab was to train bigger, more accurate language models. GPT-4 was followed by GPT-4.5; Claude 2 was followed by Claude 3. Benchmark performance was the yardstick of competition.

An agent lab solves a different problem. Rather than asking how smart a model is, it designs which tools a model uses, in what order it executes tasks, and under what conditions it hands work off to another agent. The competitive frontier has moved from the internal architecture of a single model to the structure of collaboration between models.

The fact that an open-source project like multica climbed GitHub Trending shows how far agent tooling has been democratized. When agent infrastructure first appeared, only large cloud companies and well-funded startups could touch it. In the next phase, the tooling went open source. Now technically capable companies like Naver and Kakao are moving into real internal deployment. The crucial detail is that all three stages happened in quick succession. That's the signature of diffusion entering an accelerating phase.

NVIDIA's recently unveiled diffusion language model (Diffusion LM) connects to this trend on another level. Conventional language models are autoregressive: they generate text one token at a time, in sequence. NVIDIA's experimental approach explores decoding many tokens in parallel to increase generation speed substantially. The more agents you run, the more inference speed and cost become the bottleneck. When one agent has to finish before the next can start, the delays of every step add up, so a chain of agents is only as fast as the sum of its steps. If parallel decoding becomes practical, each step gets shorter, that bottleneck shrinks sharply, and the entire cost structure of running agents changes.
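The arithmetic behind that bottleneck can be sketched in a few lines. Everything here is an illustrative assumption: the step names, the per-step latencies, and the 4x speedup are made-up numbers, not measurements of NVIDIA's model or of any real agent system.

```python
# Hypothetical per-step latencies in seconds (illustrative numbers only).
step_latency = {"research": 12.0, "analysis": 8.0, "writing": 15.0, "review": 5.0}


def chained_latency(latencies):
    """Each agent waits for the previous one, so the delays simply add up."""
    return sum(latencies.values())


def chained_latency_with_speedup(latencies, speedup):
    """Same chain, assuming every step generates `speedup` times faster."""
    return sum(seconds / speedup for seconds in latencies.values())


baseline = chained_latency(step_latency)
faster = chained_latency_with_speedup(step_latency, speedup=4.0)  # assumed factor

print(f"baseline chain: {baseline:.1f}s")
print(f"chain with faster decoding: {faster:.1f}s")
```

Because the chain's total is a plain sum, any speedup in generation carries through to every step, which is why faster decoding matters more as the number of chained agents grows.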

From using tools to designing a team

It's worth looking closely at why this matters directly to a solo operator or a small team.

Until now, using AI tools has mostly been a question of "what prompt produces a better result?" You feed ChatGPT a more precise instruction, pull a better draft out of Claude, generate an image with Midjourney. A person pulls out one tool at a time and personally carries each result forward to the next step.

The agent paradigm flips that structure. The person sets the goal and the context, and the agents distribute, execute, and review the work. A research agent gathers data, an analysis agent extracts the patterns, a writing agent produces the draft, a review agent catches the errors. The person moves up from being the one who executes each step to being the one who designs the flow.
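As a rough sketch of that flow, the person supplies only a goal and a context, and a chain of agents passes the work along. The four agents below are hypothetical stubs that append a note instead of calling a model; the names `WorkItem` and `run_pipeline` are invented for illustration and do not come from multica or any other framework.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkItem:
    goal: str                      # set by the person
    context: str                   # set by the person
    notes: List[str] = field(default_factory=list)  # filled in by agents


Agent = Callable[[WorkItem], WorkItem]


def research_agent(item: WorkItem) -> WorkItem:
    item.notes.append(f"research: gathered data for '{item.goal}'")
    return item


def analysis_agent(item: WorkItem) -> WorkItem:
    item.notes.append("analysis: extracted patterns from the data")
    return item


def writing_agent(item: WorkItem) -> WorkItem:
    item.notes.append("writing: produced a draft")
    return item


def review_agent(item: WorkItem) -> WorkItem:
    item.notes.append("review: checked the draft for errors")
    return item


def run_pipeline(item: WorkItem, agents: List[Agent]) -> WorkItem:
    """Hand the work item from one agent to the next, in order."""
    for agent in agents:
        item = agent(item)
    return item


result = run_pipeline(
    WorkItem(goal="monthly market summary", context="small consulting practice"),
    [research_agent, analysis_agent, writing_agent, review_agent],
)
for note in result.notes:
    print(note)
```

The design decision that belongs to the person is the list passed to `run_pipeline`: which agents exist and in what order they run.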

This isn't simply a story about things getting more convenient. A question long overlooked in career planning becomes important again: "What judgment am I actually exercising in this work?" The more the hands that produce the deliverables get replaced by AI, the more a person's real value comes down to the quality of their judgment and their grasp of context. If you treat a job purely as a way to get by, that judgment muscle never gets properly trained. In the agent era, the competitive person isn't the one who knows the most, but the one who can tell what to delegate from what to handle personally.

There's another point worth noticing. Once the agent layer goes open source, the range of work a single operator can take on expands considerably: one person can assemble a team of agents to handle what used to take a company dozens of people. The fact that Naver and Kakao's deployments are happening at the "enterprise level" doesn't make this irrelevant to a solo operator. The open-source version is already up on GitHub.

What a solo operator should check right now

So how do you connect this trend to your own day-to-day work?

Map out the flow of your repetitive tasks. List the work you repeat every week and every month, and turn it into a flowchart showing the order each task follows. What agents replace best are tasks with a clear shape: take an input, process it, produce an output. Once you have that flowchart, you start to see which agent could attach to which step.

Test a multi-agent open-source tool yourself. Beyond multica, multi-agent frameworks such as AutoGen (Microsoft), CrewAI, and LangGraph are already available as open source. Actually using these feels different from simply reading about them in the news. You don't need to read the implementation code. Just grasping the design, meaning which role gets assigned to which agent, is useful enough on its own.

Deliberately mark out the work you can't hand to an agent. Building trust with a customer, the judgment to read a situation's context, setting priorities when the criteria are ambiguous—these are hard for an agent to do. You need a clear sense of where in your current work this kind of judgment lives, so that your role stays well-defined even after you bring agents in.

Monitor the cost structure. If NVIDIA's diffusion LM experiment reaches a practical stage, API call costs could drop below where they are today. If you're holding off on running agents purely for cost reasons, it's reasonable to check the cost trend on a six-month cadence. Cloud inference costs have fallen dramatically over the past two years alone.

Invest in learning agent design. If prompt engineering was the core skill of 2023, the core skill of the second half of 2025 is agent orchestration—who does what, and when to ask a human to review. This is closer to a sense for designing work than to technical knowledge. The person who knows their own work best is the one who can design it best.
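The checklist above can be turned into a small planning table. The sketch below is one possible way to do it, not a method from any framework: the `Step` fields, the three assignment labels, and the sample weekly tasks are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    clear_io: bool        # clear shape: take an input, process it, produce an output
    needs_judgment: bool  # trust-building, reading context, ambiguous priorities


def assign(step: Step) -> str:
    """Decide who does a step and whether a human should review it."""
    if step.needs_judgment:
        return "human"
    if step.clear_io:
        return "agent"
    # No special judgment required, but the shape is fuzzy: delegate with a check.
    return "agent + human review"


# Hypothetical weekly workflow, listed in the order the work is done.
weekly_flow = [
    Step("collect weekly sales figures", clear_io=True, needs_judgment=False),
    Step("draft summary report", clear_io=True, needs_judgment=False),
    Step("reply to a routine inquiry", clear_io=False, needs_judgment=False),
    Step("decide which client to prioritize", clear_io=False, needs_judgment=True),
]

plan = {step.name: assign(step) for step in weekly_flow}
for name, owner in plan.items():
    print(f"{name}: {owner}")
```

Filling in the two boolean columns is the real work. It requires knowing the job, which is why the person closest to the work is best placed to design it.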

On the day the declaration came that the model labs had become agent labs, an open-source agent project topped GitHub Trending, and Naver and Kakao each announced internal deployments. The convergence of these three signals on a single day was no coincidence. The gap between the person who observes this trend now and redesigns their own workflow, and the person who follows along once it has settled, will show up not as a technology gap but as a gap in design sense.