The Fog of Code
State of 2025
Andrej Karpathy recently confessed to feeling more behind than ever as a programmer. This is a man who built neural networks at Tesla, co-founded OpenAI, and can derive backpropagation on a napkin. If he feels behind, what about the rest of us?
The confession resonated because it’s true. But I think the anxiety is misplaced. What’s actually happening is more interesting—and more revealing about how industries operate under uncertainty—than any individual’s struggle to keep up.
I.
Here’s what I keep noticing: code can now be produced faster than any human can read and comprehend it. This is new. For the entire history of programming, the bottleneck was writing. Now writing is cheap. Understanding is the bottleneck.
The Cursor CEO warns that “vibe coding” builds shaky foundations—you ask AI to write code, you don’t look at it, you add another floor. And another. And another. Until the building collapses.

He’s right. But he’s also selling shovels during a gold rush. Cursor just raised $2.3 billion at a $29 billion valuation. Their business depends on the fundamental bet that this uncertainty is navigable—that there’s a middle ground between writing every line yourself and closing your eyes entirely.
This is what I want to examine. Not whether AI coding is good or bad, but something more interesting: how industries make money while operating in fog.
II.
Let me tell you about oil exploration.
You drill a hole in the ground. Maybe there’s oil. Maybe there isn’t. The geology gives you hints, but fundamentally you’re placing bets with incomplete information. The industry calls these “wildcat wells”—exploratory drilling in unproven areas. The success rate historically hovered around 10%.
Here’s the thing: ExxonMobil is still worth $450 billion. The industry figured out how to be profitable despite the uncertainty. How?
First, they got honest about what they didn’t know. Reserve estimates come in two flavors: deterministic (single best guess) and probabilistic (range of outcomes with probabilities). The industry learned that single-point estimates are dangerous. The range tells you more than the number.
Second, they developed heuristics. Not because heuristics are perfect, but because you need some decision framework when operating in fog. Good enough beats paralysis.
Third, they priced the uncertainty into everything. Oil isn’t priced cost-plus—you can’t pass on your drilling failures to customers. So you’d better have a capital structure that survives the dry wells.
Fourth—and this is crucial—they built portfolios. Any individual well might fail. But a portfolio of wells, with uncorrelated risks, produces predictable aggregate outcomes. The variance averages out.
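To see the portfolio logic in numbers, here is a toy Monte Carlo in Python. The 10% hit rate echoes the wildcat figure above; the payoff multiple and cost are invented for illustration, not industry data. It also shows why the probabilistic framing from the first point matters: the output is a P10/P50/P90 range, not a single number.

```python
# Toy Monte Carlo: a single wildcat well is a coin flip, but a portfolio
# of independent wells has a predictable aggregate return.
# All numbers here (10% hit rate, 20x payoff) are illustrative, not industry data.
import random

HIT_RATE = 0.10   # historical wildcat success rate cited above
PAYOFF = 20.0     # revenue multiple on a successful well (assumed)
COST = 1.0        # normalized drilling cost per well

def portfolio_return(n_wells: int) -> float:
    """Net return per well across a portfolio of independent wells."""
    revenue = sum(PAYOFF for _ in range(n_wells) if random.random() < HIT_RATE)
    return (revenue - n_wells * COST) / n_wells

def percentile_range(n_wells: int, trials: int = 2_000) -> tuple[float, float, float]:
    """P10/P50/P90 of per-well return: a probabilistic estimate, not a point guess."""
    outcomes = sorted(portfolio_return(n_wells) for _ in range(trials))
    return (outcomes[trials // 10], outcomes[trials // 2], outcomes[9 * trials // 10])

for n in (1, 10, 100, 1000):
    p10, p50, p90 = percentile_range(n)
    print(f"{n:>4} wells: P10={p10:+.2f}  P50={p50:+.2f}  P90={p90:+.2f}")
# Expected value per well: 0.10 * 20 - 1 = +1.0. One well is boom or bust;
# at 1,000 wells the P10/P90 range collapses toward that expected value.
```

At one well you get boom or bust. At a thousand, the range nearly closes around the expected value. That is the whole trick: uncertainty per well, near-certainty per portfolio.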
Software development is just starting to learn these lessons.
III.
The METR study surprised everyone. Experienced developers, working on codebases they’d contributed to for years, using state-of-the-art AI tools. The result: they were 19% slower with AI assistance.
But here’s the part that matters more: they thought they were 20% faster.
This is the most dangerous kind of uncertainty—the kind you can’t see. The developers felt productive. They had that sensation of progress, of code flowing onto the screen. But when measured objectively, they were slower.
Why? The study suggests several reasons: prompting overhead, time spent reviewing and fixing AI suggestions, the cognitive load of evaluating code you didn’t write. But I think there’s something deeper.
AI coding tools hijack the brain’s reward system. You get the dopamine hit of seeing code appear. You feel like you’re building. The subjective experience of productivity diverges from actual productivity.
This isn’t new. Casinos figured this out decades ago. So did social media. The feeling of progress is chemically addictive in ways that make it a terrible indicator of actual progress.
IV.
Meanwhile, a parallel universe of startups is emerging. They promise to solve the uncertainty.
AI code review tools. AI testing tools. AI security scanning. AI CI/CD optimization. Each one claims to be the middleware that makes AI-generated code safe to ship.
Step back and notice what’s happening. You have non-deterministic systems (LLMs) generating code, and you’re building more non-deterministic systems to validate that code, and then more systems to validate the validators. It’s turtles all the way down.
This is the same mistake the financial industry made with CDOs. You take uncertain assets, package them, rate the package as less uncertain, then package the packages, rate them again. At each layer, someone is getting paid to issue a stamp of approval. And at the end, you’ve built a tower of correlated risks masquerading as diversification.
I’m not saying AI coding tools are going to cause a financial crisis. I’m saying the structure is similar: complexity that feels like safety, intermediaries extracting value at each layer, and fundamental uncertainty that never goes away—it just gets hidden.
V.
Vercel recently published something that cut through my cynicism. They’d built a sophisticated internal agent with specialized tools, heavy prompt engineering, careful context management. It worked... kind of. But it was fragile, slow, and required constant maintenance.
So they did something counterintuitive. They stripped the agent down to a single capability: execute bash commands. Let Claude read files using grep, cat, and ls. The Unix tools your grandfather used.
The result? 100% success rate instead of 80%. Fewer steps, fewer tokens, faster responses. All by doing less.
I keep thinking about this. The most sophisticated approach—custom tools for schema lookup, query validation, error recovery—was beaten by the simplest one. Give the AI access to files and get out of its way.
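For concreteness, here is roughly what that stripped-down design looks like as code. This is a minimal sketch of a single-tool agent loop, assuming the Anthropic Python SDK; the model id, the task prompt, and the turn limit are placeholder assumptions, and this is the generic pattern, not Vercel’s actual implementation.

```python
# Minimal single-tool agent: one "bash" tool, no custom schema lookups,
# no validators. Assumes the Anthropic Python SDK and an ANTHROPIC_API_KEY
# in the environment. Sketch of the pattern, not production code.
import subprocess
import anthropic

client = anthropic.Anthropic()

BASH_TOOL = {
    "name": "bash",
    "description": "Run a shell command (e.g. grep, cat, ls) and return its output.",
    "input_schema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}

def run_agent(task: str, max_turns: int = 20) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_turns):
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",  # assumption: substitute a current model
            max_tokens=2048,
            tools=[BASH_TOOL],
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            # Model is done: return its final text answer.
            return "".join(b.text for b in response.content if b.type == "text")
        # Execute each requested command and feed the output back.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type == "tool_use":
                proc = subprocess.run(
                    block.input["command"], shell=True,
                    capture_output=True, text=True, timeout=60,
                )
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": proc.stdout + proc.stderr,
                })
        messages.append({"role": "user", "content": results})
    return "(gave up after max_turns)"

print(run_agent("Find where the config loader lives in this repo and summarize it."))
```

The entire “tool layer” is subprocess.run. Everything the sophisticated version did with schema lookups and validators, this version does by letting the model run grep and read the output.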
This feels like a lesson. Maybe the right response to uncertainty isn’t more tools. Maybe it’s fewer.
VI.
There’s a pattern I’ve noticed across industries that successfully operate under uncertainty.
Aviation has checklists. Not because pilots are stupid, but because memory is unreliable and the stakes are catastrophic. The checklist is a forcing function for thoroughness, not a substitute for skill.
Medicine has protocols. When you arrive at an ER, triage follows a flowchart. Not because doctors lack judgment, but because systematic evaluation catches things that intuition misses.
Insurance has actuarial tables. They don’t know which house will burn down, but they know what percentage will. The uncertainty at the individual level becomes certainty at the portfolio level.
What do all these have in common? They accept uncertainty as fundamental, then build systems that produce reliable outcomes despite it.
Software development is still in denial. We want to believe we can eliminate uncertainty with better tools, better tests, better processes. AI coding tools are the latest incarnation of this fantasy.
VII.
Let me be concrete about what I mean.
If you’re building a prototype, vibe code away. The uncertainty there is about whether the idea works, not whether the code is correct. Ship fast, learn fast, delete fast.
If you’re building production systems, you need a checklist. Not for every line of code, but for the decisions that matter: security boundaries, data flows, failure modes. AI can help you generate code, but it can’t replace the discipline of knowing what questions to ask; I sketch one such checklist just after this list of advice.
If you’re reviewing AI-generated code, budget time for understanding, not just scanning. The METR study found developers spending much of their time “cleaning up” AI output. This isn’t a bug—it’s the work. Understanding code you didn’t write is always slower than understanding code you did.
If you’re building a company in this space, be honest about what layer of the stack you’re in. Are you helping developers make better decisions? Or are you adding complexity that feels like safety while deferring uncertainty to the next layer?
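To make the checklist point concrete, here is a sketch of what it might look like when encoded rather than remembered. The items and the crude sign-off gate are my own illustration, not a standard; the point is that the questions are pinned down before review starts, not improvised during it.

```python
# Illustrative merge gate: not a linter, just a mechanism that forces a
# human to answer the questions AI can't. Items and threshold are examples.
PRODUCTION_CHECKLIST = [
    "Security boundaries: which inputs cross a trust boundary, and where are they validated?",
    "Data flows: what PII or secrets does this change touch, and where do they end up?",
    "Failure modes: what happens when the dependency this calls is down or slow?",
    "Rollback: can this change be reverted without a data migration?",
]

def review_gate(answers: dict[str, str]) -> list[str]:
    """Return the checklist items that still lack a substantive answer."""
    return [q for q in PRODUCTION_CHECKLIST if len(answers.get(q, "").strip()) < 20]

# Usage: block the merge until every question has a real answer, not an "LGTM".
# ("handlers.py" is a hypothetical file name for the example.)
unanswered = review_gate({PRODUCTION_CHECKLIST[0]: "All form fields validated in handlers.py"})
for q in unanswered:
    print("UNANSWERED:", q)
```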
VIII.
Now let me talk about something I find genuinely novel. I’ve watched people code in ways that would have seemed impossible two years ago.
A designer who’d never written JavaScript built a fully functional web app. A product manager prototyped backend APIs to test an idea. A teenager built a game over a weekend.
This is real. The floor has dropped on what it takes to build software. And it happened essentially overnight.
The question is what comes next. I can see two trajectories.
In one, we get a Cambrian explosion of software. People who never would have built things start building. New ideas get tested. Some fail. Most fail. But the ones that work reshape industries. This is what happened when publishing became cheap, when video production became cheap, when app development became cheap.
In the other, we get a tower of Babel. Code nobody understands, dependencies nobody audited, systems nobody can debug. Technical debt accruing faster than anyone can pay it down. And then, one day, something important breaks, and nobody understands the system well enough to fix it.
I don’t know which trajectory we’re on. Maybe both—the explosion for some things, the collapse for others.
IX.
Here’s what I believe.
The uncertainty is real and it’s not going away. AI systems are stochastic. They hallucinate. They produce code that passes tests but contains subtle bugs. No amount of tooling will make them deterministic.
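Here is the shape of that last failure mode in miniature. A toy example of my own, not taken from any study: the function passes every test it ships with and is still wrong.

```python
# Toy illustration of "passes tests but contains a subtle bug". The
# docstring promises order; the implementation and the tests both
# quietly ignore it.
def unique(items: list) -> list:
    """Return items with duplicates removed, preserving first-seen order."""
    return list(set(items))  # BUG: sets do not preserve insertion order

# Plausible generated tests -- all green, none catches the bug:
assert sorted(unique([3, 1, 3, 2])) == [1, 2, 3]  # sorted() hides the ordering bug
assert len(unique([1, 1, 1])) == 1
assert unique([]) == []

# The correct version, for contrast:
def unique_fixed(items: list) -> list:
    return list(dict.fromkeys(items))  # dicts preserve insertion order (3.7+)
```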
The productivity illusion is real too. When developers think they’re 20% faster but are actually 19% slower, that’s not a rounding error. It’s a systematic bias that will distort every decision made under its influence.
But the potential is also real. The Vercel result—doing less and getting more—points toward something. Maybe the answer isn’t more tools to manage AI. Maybe it’s finding the right level of abstraction where AI help is genuinely helpful, and doing the rest yourself.
The industries that thrive under uncertainty share a common trait: they’ve stopped trying to eliminate uncertainty and started trying to make good decisions despite it. They price risk explicitly. They build portfolios. They use checklists. They measure what matters.
Software development hasn’t internalized this yet. We’re still in the phase where every new tool promises to make the uncertainty go away. It won’t. The sooner we accept that, the sooner we can build systems that work anyway.
X.
If you’re selling tools in this space, sell clarity, not magic. The best products help people understand what they’re dealing with, not hide it.
If you’re buying tools, be skeptical of anything that claims to solve a problem that’s fundamentally unsolvable. The question isn’t whether AI-generated code has bugs. It does. The question is whether your process catches them.
If you’re coding, invest in understanding. The bottleneck shifted from writing to reading. Act accordingly. The developers who thrive won’t be the ones who generate the most code. They’ll be the ones who understand what they’re building.
And if you’re feeling behind, like Karpathy, take comfort in this: everyone is behind. The people who seem ahead are mostly just noisier about their confusion. The honest practitioners are all feeling their way through the fog, same as you.
The profession is being refactored. The bits contributed by the programmer are increasingly sparse. But the judgment about which bits matter? That remains irreducibly human.
Roll up your sleeves.
Primary source: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
arXiv paper: https://arxiv.org/abs/2507.09089
The key findings: 16 experienced open-source developers completed 246 real tasks on repositories they’d contributed to for years (averaging 5 years, 1,500 commits). When allowed to use AI tools (primarily Cursor Pro with Claude 3.5/3.7 Sonnet), they were 19% slower—while believing they were 20% faster.
The perception gap is the most striking part. Before starting, developers predicted AI would make them 24% faster. After finishing (with objectively slower results), they still estimated AI had sped them up by ~20%.
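To put that gap in time units, here is a back-of-the-envelope calculation. It reads “19% slower” as a 1.19x time multiplier and “20% faster” as a 1/1.2 multiplier; that is one plausible reading of the self-reports, not a reanalysis of METR’s data.

```python
# Back-of-the-envelope size of the miscalibration, under the stated reading.
baseline = 1.0                 # time without AI, normalized
actual = baseline * 1.19       # "19% slower": measured time multiplier
believed = baseline / 1.20     # "20% faster": implied time multiplier, ~0.83

gap = actual / believed        # ~1.43
print(f"actual {actual:.2f}x vs believed {believed:.2f}x: "
      f"{gap - 1:.0%} more time than perceived")
# Under this reading, developers spent roughly 43% more time than they
# thought they were spending.
```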
Worth noting METR’s own caveats: they don’t claim this applies to all developers or all contexts. Less experienced developers, unfamiliar codebases, or different task types might show different results. But for senior engineers working in code they know deeply, the current tools added friction rather than removing it.