Vibe Coding Broke Its Promise

Spend an hour on any builder forum and you’ll find the same confession written a hundred different ways. Someone built their app in a weekend with an AI tool. It worked. They shipped it. Now it’s Monday and the auth is broken, the database is silently dropping rows, and the bug nobody can reproduce is the one that’s costing them customers.
The dream we sold each other six months ago is showing up at the door asking for its money back.
I want to be careful about how I say this, because I don’t think the people who built or used these tools were wrong to be excited. The leap was real. Watching a working interface materialize from a paragraph of plain English is one of the genuinely magical experiences of the last decade of software. I felt it too. We all did.
But somewhere between the demo and the deployment, a quiet substitution happened. We started calling prototypes products. We started calling demos software. And the bill for that confusion is now coming due.
The Misdiagnosis
The most common explanation I see for why this is happening blames the model. The AI isn’t smart enough yet. It hallucinates. It picks the wrong libraries. It writes code a senior engineer would catch and rewrite.
That explanation is comforting because it suggests a fix that’s already on the way. Wait six months. The next model will be better. Eventually the gap closes and everything works.
I don’t believe that. And I don’t believe it because the failure mode I keep seeing has nothing to do with how good the code is.
GitHub has reported that roughly 46% of the code in files where Copilot is enabled is now written by the model rather than the person. Around the same time, the application security firm Veracode published research finding that AI-generated code introduced security vulnerabilities in about 45% of the coding tasks it tested. Those numbers will get worse before they get better, and a smarter model will not fix them.
The model isn’t the problem. The process is.
What’s Actually Missing
Walk me through how a vibe-coded app gets built and tell me where the architecture decision happens.
You describe what you want. The AI generates an interface and some code behind it. You look at the interface, you click around, it does roughly the thing you asked for, and you call it done. At no point in that loop did anyone — human or machine — stop and define what was actually being built.
There is no schema. There is no data model. There is no list of the states the system can be in. There is no definition of what counts as valid. There is no contract between the front end and whatever is pretending to be a back end. There is no decision about what happens when the user does something the generator didn’t anticipate, because nobody anticipated it.
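To make that concrete, here is what the smallest version of “a list of states the system can be in” looks like when someone actually writes it down. A minimal TypeScript sketch; the Subscription type and its statuses are hypothetical, not taken from any particular tool:

```typescript
// A minimal sketch of "define the states before generating the screens".
// The Subscription type and its statuses are hypothetical.
type Subscription =
  | { status: "trialing"; trialEndsAt: Date }
  | { status: "active"; renewsAt: Date }
  | { status: "past_due"; lastFailedCharge: Date }
  | { status: "canceled"; canceledAt: Date };

// Because the states are enumerated, the compiler can prove every one is
// handled. A vibe-coded UI only handles the states the demo happened to visit.
function bannerFor(sub: Subscription): string {
  switch (sub.status) {
    case "trialing":
      return `Trial ends ${sub.trialEndsAt.toDateString()}`;
    case "active":
      return `Renews ${sub.renewsAt.toDateString()}`;
    case "past_due":
      return "Payment failed; update your card";
    case "canceled":
      return "Subscription ended";
  }
}
```

Twenty lines, and suddenly “what happens when a charge fails” is a question with an answer instead of a surprise.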
What got built is a thing that looks like the thing you asked for, along the specific path you happened to walk during the demo. Step off that path and the whole structure reveals itself as scaffolding. There was never a building underneath.
This is not a failure of intelligence. It’s a failure of definition. And no amount of additional intelligence applied to an undefined problem will produce a defined result. It will just produce a more convincing version of the same scaffolding.
Let me get specific, because abstractions are how this conversation keeps going in circles.
Authentication is not a feature you add later. It is a decision about who your users are, what they can see, and what trust boundary your application sits on. Bolting it onto a vibe-coded app two weeks after launch is the software equivalent of installing a front door on a house that was built without walls.
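For concreteness, the decision part of that fits in a dozen lines. A minimal sketch, with hypothetical roles and actions, of what “who can see what” looks like once it has actually been decided:

```typescript
// A minimal sketch of authentication-as-decision: who the users are and what
// each kind of user may do. The roles and actions are hypothetical.
type Role = "owner" | "member" | "guest";
type Action = "read" | "write" | "delete";

const permissions: Record<Role, ReadonlyArray<Action>> = {
  owner: ["read", "write", "delete"],
  member: ["read", "write"],
  guest: ["read"],
};

// One question, asked at one trust boundary, instead of every generated
// screen improvising its own check.
function can(role: Role, action: Action): boolean {
  return permissions[role].includes(action);
}

console.log(can("guest", "write")); // false
```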
A database schema is not something an AI should be guessing at while it generates the form that writes to it. The schema is the spine of the application. Every decision downstream of it — what you can query, what you can index, what you can change later without breaking everything — is constrained by choices that were made or not made up front. When the schema is improvised, every future change is a renovation.
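Here is the kind of thing “made up front” means in practice: a sketch of a migration, carried here as a TypeScript string purely for illustration, where every constraint is a decision the generator would otherwise improvise. The tables and columns are hypothetical:

```typescript
// A sketch of schema-as-spine: every constraint below is a decision the
// generator would otherwise guess at. Table and column names are hypothetical.
export const migration = `
  CREATE TABLE users (
    id         UUID PRIMARY KEY,
    email      TEXT NOT NULL UNIQUE,  -- decided: one account per email
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
  );

  CREATE TABLE invoices (
    id           UUID PRIMARY KEY,
    user_id      UUID NOT NULL REFERENCES users(id),  -- decided: no orphan invoices
    amount_cents INTEGER NOT NULL CHECK (amount_cents >= 0),
    status       TEXT NOT NULL CHECK (status IN ('draft', 'sent', 'paid'))
  );

  -- decided: the query pattern this index exists to serve
  CREATE INDEX invoices_by_user ON invoices (user_id);
`;
```

Decide any of that up front and it costs one review. Change it after launch and you’re doing the renovation.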
An API contract isn’t optional. The moment your app talks to anything else — a payment processor, an email service, another piece of software, an AI agent — there has to be a defined surface. Without one, integrations become a series of one-off hacks that nobody can maintain and nobody wants to inherit.
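A defined surface can be as small as this. A hedged sketch of one endpoint’s contract in TypeScript; the endpoint, fields, and error codes are invented for illustration:

```typescript
import { randomUUID } from "node:crypto";

// A minimal sketch of an API contract: the shape both sides build against.
// The endpoint, fields, and error codes are hypothetical.
interface CreateInvoiceRequest {
  userId: string;      // UUID of an existing user
  amountCents: number; // non-negative integer
}

type CreateInvoiceResponse =
  | { ok: true; invoiceId: string }
  | { ok: false; error: "USER_NOT_FOUND" | "INVALID_AMOUNT" };

async function createInvoice(
  req: CreateInvoiceRequest,
): Promise<CreateInvoiceResponse> {
  if (!Number.isInteger(req.amountCents) || req.amountCents < 0) {
    return { ok: false, error: "INVALID_AMOUNT" };
  }
  // ...persistence elided; the contract, not the implementation, is the point
  return { ok: true, invoiceId: randomUUID() };
}
```

An integration written against those types can’t silently drift: rename a field and every caller breaks at build time instead of at 2 a.m.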
These aren’t advanced topics. They are the table stakes of building software that lives longer than its first weekend. And they are exactly the things that get skipped when the entire build process is describe it, see it, ship it.
The Rule AI Didn’t Change
I don’t think anyone set out to build an industry that ships fragile software. I think the tools that emerged in this category optimized for the moment that sells the tool — the moment of magic, when an idea becomes a working screen in under a minute.
Time-to-magic became the measurable thing. Time-to-production was somebody else’s problem.
That’s a reasonable optimization for a demo. It is a terrible optimization for a category of software that’s now responsible for shipping real applications to real users with real money on the line. It leaves an entire generation of builders stranded at the 90% mark, with a thing that works on their screen and falls apart everywhere else.
The dirty secret of this category is that the easy part was the first 90%. The next 9% — making it actually work for more than one user, on more than one device, under conditions you didn’t anticipate — is harder than everything that came before it. And the last 1%, the part that distinguishes a working application from a fragile one, is the part that requires you to have known what you were building before you started.
Here’s the thing nobody wants to hear, because it sounds like a step backwards in a moment that’s supposed to be all forward motion.
The best software has always started with a definition. Architecture before code. A clear model of the problem before any line is written. This was true when teams of fifty engineers built systems by hand, and it’s true now when one person and a model can build the same system in a weekend.
AI didn’t change that rule. AI made it more important, not less.
When the cost of producing code approaches zero, the cost of producing the wrong code also approaches zero. Which means the only thing that still has cost is figuring out what the right code would have been. That work — the definition work, the architecture work, the part where you decide what you’re actually building before you start building it — is the only part that hasn’t been commoditized.
It’s also the part the current generation of tools skipped.
I don’t think the answer is to slow down. I don’t think the answer is to go back to writing everything by hand. The leap was real, and the leap is staying.
The answer is to build the definition step into the loop. Not as a manual gate that slows you down, but as the actual foundation the rest of the work sits on. The people figuring out how to do that are working on something quieter than what you’ve been seeing. They are about to make the loud version of this category look like what it always was: a demo.
A working screen was never the same thing as a working system. We’re all about to remember why.