AI Code Rescue · Case Study · Anonymised

We Fixed an AI-Built App: What We Found

An anonymised autopsy. The auth hole, the structureless database, the thing that died at ten users, and the code no human understood. The pattern, not the person. This is not a rare edge case.

Before you read: The villain in this story is the hype and the tool vendors, not the founder who got burned. They were sold something and told it was a development team. It was not. This is what we found, what it meant, and what it took to fix it.

How it started

A founder came to us with a SaaS application for managing client projects and invoicing. It had been built almost entirely using an AI coding tool. The founder was technical enough to read code but not experienced enough to know what bad architecture looks like until it is already causing problems.

The app had been live for about eight months. It worked. Clients were using it. There were forty-odd active accounts. Then a client mentioned, casually, that they could see another company's invoices in their dashboard. The founder called us that afternoon.

We agreed on an emergency audit. What follows is what we found. Identifying details have been changed to protect both the business and the individuals involved, but the technical details are accurate. We have seen variations of all of it, repeatedly, in AI-generated codebases.

What we found, and what it meant

Finding 1

The authorisation hole: users could see each other's data

The application fetched records by their database ID without checking whether the logged-in user was allowed to see that record. Any authenticated user who knew or guessed the numeric ID of a record belonging to another company could retrieve it directly via the API. The IDs were sequential integers starting at one, so there was nothing to guess. You could enumerate them trivially.

This was not a subtle edge case. It was the fundamental access control check, missing from every data retrieval endpoint in the application. The AI tool had written the code to fetch data. It had not written the code to confirm you were allowed to see that data before fetching it.
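
To make the pattern concrete, here is a minimal sketch in Python with SQLite. The table, the column names, and the function get_invoice_unscoped are our own illustration, not the client's code, and we are not naming the original stack. It assumes two companies with one invoice each and sequential IDs.

```python
import sqlite3

# Hypothetical schema: two companies, one invoice each, sequential integer IDs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL,
        total_pence INTEGER NOT NULL
    );
    INSERT INTO invoices (id, company_id, total_pence) VALUES
        (1, 100, 25000),
        (2, 200, 99000);
""")


def get_invoice_unscoped(invoice_id):
    """Fetch by ID alone. Nothing here asks who is making the request."""
    return conn.execute(
        "SELECT id, company_id, total_pence FROM invoices WHERE id = ?",
        (invoice_id,),
    ).fetchone()


# A user logged in as company 100 simply walks the ID space.
my_company_id = 100
fetched = [get_invoice_unscoped(i) for i in range(1, 3)]
other_companies = [
    row for row in fetched if row is not None and row[1] != my_company_id
]
print(other_companies)
```

Every row in other_companies is a record the caller should never have received. Nothing in the lookup can stop it, because the caller's identity never reaches the query.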

Why AI tools do this: The AI was shown a schema and asked to generate CRUD endpoints. It generated them. It did not add authorisation checks because authorisation requires understanding the multi-tenancy model of the application, which requires understanding the business. The tool does not understand the business. It has never understood any business.

Finding 2

The database: no structure, no indexes, no relationships

The database had eleven tables. None of them had indexes on any column except the primary key. Foreign key relationships existed in the application code but not in the database schema itself, meaning the database could not enforce them and they were routinely violated. One table stored line items for invoices, yet as far as the database knew it had no relationship to the invoices those items belonged to; the only link was a column called invoice_id that was never indexed and could contain anything.

A query that listed all invoices for a client, with their line items, their payment status, and the associated client details, took twenty-three seconds on forty accounts. On four hundred accounts it would have been unusable.
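
The difference between a relationship the application assumes and one the database enforces can be sketched in a few lines. This is an illustration in Python with SQLite, using simplified table names of our own (invoices, line_items); it is not the client's schema. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        client_id INTEGER NOT NULL
    );
    CREATE TABLE line_items (
        id INTEGER PRIMARY KEY,
        -- The relationship is declared in the schema, so the database enforces it.
        invoice_id INTEGER NOT NULL REFERENCES invoices(id),
        amount_pence INTEGER NOT NULL
    );
    INSERT INTO invoices (id, client_id) VALUES (1, 100);
""")

# A line item for a real invoice is accepted.
conn.execute("INSERT INTO line_items (invoice_id, amount_pence) VALUES (1, 5000)")

# A line item pointing at an invoice that does not exist is refused.
try:
    conn.execute("INSERT INTO line_items (invoice_id, amount_pence) VALUES (999, 5000)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

(item_count,) = conn.execute("SELECT COUNT(*) FROM line_items").fetchone()
print(orphan_rejected, item_count)
```

Without the REFERENCES clause the second insert succeeds silently, which is how a column ends up able to "contain anything".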

Why AI tools do this: Generating a schema that works for a demo requires almost no database knowledge. Generating a schema that works at scale requires understanding normalisation, query patterns, indexing strategy, and what the application will need to do two years from now. AI tools optimise for the demo. They always have.

Finding 3

Concurrency: worked for one, died at ten

Several operations in the application were not atomic. Specifically, generating and assigning invoice numbers used a two-step process: read the highest existing number, add one, write the new invoice with that number. With a single user this works. With two users doing it simultaneously there is a race condition that produces duplicate invoice numbers. Three of the founder's clients used the tool regularly. They had already noticed occasional duplicate numbers and had put them down to a minor bug they had not yet reported.

It was not a bug in the traditional sense. It was a fundamental failure to understand that databases need transactions when multiple users might modify shared state concurrently.
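
The race is easy to reproduce once the interleaving is written out by hand. The sketch below uses Python and SQLite with hypothetical names (read_next_number, write_invoice) and an assumed starting number of 41. It runs the two steps for two requests in the order a busy server could plausibly run them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, number INTEGER NOT NULL)")
conn.execute("INSERT INTO invoices (number) VALUES (41)")


def read_next_number(conn):
    """Step 1: read the highest existing number and add one."""
    (highest,) = conn.execute(
        "SELECT COALESCE(MAX(number), 0) FROM invoices"
    ).fetchone()
    return highest + 1


def write_invoice(conn, number):
    """Step 2: write the new invoice with that number."""
    conn.execute("INSERT INTO invoices (number) VALUES (?)", (number,))


# Two requests arrive together. Both finish step 1 before either starts step 2.
number_for_a = read_next_number(conn)
number_for_b = read_next_number(conn)
write_invoice(conn, number_for_a)
write_invoice(conn, number_for_b)

numbers = [n for (n,) in conn.execute("SELECT number FROM invoices ORDER BY id")]
print(numbers)
```

Both requests read the same maximum, so both write the same number. Nothing in the schema objects, because the number column carries no uniqueness constraint.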

Why AI tools do this: Concurrency bugs are invisible in ordinary testing because one person using the application never produces two simultaneous requests. The AI tool's output was only ever exercised the same way: one user at a time. Concurrent multi-user access is a real-world property that does not show up in a demo.

Finding 4

The maintainability problem: code no human understands

The application had grown through iteration. The founder had used the AI tool to add features over eight months by describing what they needed and accepting what the tool produced. The result was a codebase that had no consistent structure, no separation of concerns, and no documentation. Business logic was scattered across controllers, middleware, and database queries with no discernible pattern.

We asked the founder to walk us through what a specific calculation did. They could not. They had never written the code that did it; they had accepted the output the tool produced. The calculation was correct for the cases they had tested but broke for edge cases they had not considered, which the tool had no way of anticipating because it did not know the edge cases existed.

Why AI tools do this: They generate code to satisfy the prompt, not to be understood later. Consistency requires a design philosophy applied across the whole codebase. AI tools are stateless between prompts. They cannot maintain a design philosophy because they do not have one.

What it took to make it safe

Fixing it did not mean rebuilding from scratch. It meant fixing the specific problems in priority order, starting with the ones that could cause immediate harm. The order mattered. The auth hole was day one.

Week 1: Access control, end to end

Every data retrieval endpoint reviewed and updated to enforce ownership checks at the database level, not just the application level. The kind of fix that is straightforward once you understand the problem and terrifying when you realise it was missing.
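
One way to read "at the database level" is that ownership becomes part of the query itself rather than a check bolted on afterwards. The sketch below is an assumption about the shape of such a fix, not the code we shipped. The function name get_invoice_for_company and the schema are hypothetical, and company_id is assumed to come from the authenticated session, never from the request.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (
        id INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL,
        total_pence INTEGER NOT NULL
    );
    INSERT INTO invoices (id, company_id, total_pence) VALUES
        (1, 100, 25000),
        (2, 200, 99000);
""")


def get_invoice_for_company(conn, company_id, invoice_id):
    """Ownership is in the WHERE clause, so another tenant's ID matches no row."""
    return conn.execute(
        "SELECT id, company_id, total_pence FROM invoices "
        "WHERE id = ? AND company_id = ?",
        (invoice_id, company_id),
    ).fetchone()


own = get_invoice_for_company(conn, 100, 1)      # the caller's own invoice
foreign = get_invoice_for_company(conn, 100, 2)  # belongs to company 200
print(own, foreign)
```

Because the lookup returns nothing for a record the caller does not own, the endpoint can answer "not found" in both cases and avoid confirming that the ID exists at all.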

Week 2: Database restructure

Proper foreign key constraints added. Indexes added to every column used in WHERE clauses or JOIN conditions. The invoice line-items table rebuilt with a proper relationship. The twenty-three-second query dropped to under two hundred milliseconds.
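
The effect of an index on a WHERE clause can be checked by asking the database for its query plan before and after. This is a small sketch in Python with SQLite on an illustrative line_items table; the index name is our own, and the exact wording of the plan varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_items ("
    "id INTEGER PRIMARY KEY, "
    "invoice_id INTEGER NOT NULL, "
    "amount_pence INTEGER NOT NULL)"
)

QUERY = "SELECT amount_pence FROM line_items WHERE invoice_id = ?"


def plan(conn):
    """Return the planner's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (1,)).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text


before = plan(conn)  # no index on invoice_id: the whole table is scanned
conn.execute("CREATE INDEX idx_line_items_invoice_id ON line_items(invoice_id)")
after = plan(conn)   # the planner can now search via the index

print(before)
print(after)
```

A full scan costs more with every row added, which is why a query can feel fine in a demo and become unusable as the data grows.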

Week 3: Concurrency and transactions

All operations that modify shared state wrapped in proper database transactions. Invoice number generation replaced with a locked sequence that cannot produce duplicates regardless of concurrent access. The existing duplicate invoice numbers identified and resolved with the founder.

Weeks 4 to 6: Consolidation and documentation

Business logic consolidated into a consistent structure. The key calculations documented in plain language so the founder understood what their own software did. Tests written against the documented behaviour.
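
As an illustration of what "tests written against the documented behaviour" can mean, here is a deliberately small, hypothetical example in Python. The rule, the function invoice_total, and the 20% rate are invented for the sketch and are not the client's actual calculation. The point is the shape: a plain-language rule in the docstring, and edge cases pinned down as assertions.

```python
from decimal import Decimal, ROUND_HALF_UP


def invoice_total(line_items, vat_rate=Decimal("0.20")):
    """Documented rule (hypothetical): sum quantity x unit price across all
    line items, add VAT to the subtotal, then round once, at the end,
    to whole pence with halves rounding up."""
    subtotal = sum(
        (Decimal(qty) * Decimal(price) for qty, price in line_items),
        Decimal("0"),
    )
    total = subtotal * (1 + vat_rate)
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Edge cases written down as tests, so they stay true after the next change.
assert invoice_total([]) == Decimal("0.00")                     # empty invoice
assert invoice_total([(2, "50.00")]) == Decimal("120.00")       # ordinary case
assert invoice_total([(3, "0.415")], vat_rate=Decimal("0")) == Decimal("1.25")  # exact half rounds up
```

Prices are passed as strings and handled as Decimal so that the rounding rule applies to the value as written rather than to a binary floating-point approximation of it.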

"The tool did not build bad software out of malice. It built software that looked correct in the only test it could run: does it work for one person in one scenario, right now. That is the demo. That is the only test AI tools can pass reliably."

The founder's product survived. It is running now with proper engineering underneath it. The clients who had been using it did not notice most of the changes. The duplicate invoice number issue required a conversation that was not comfortable, but it was not catastrophic.

The lesson is not "do not use AI tools." The lesson is: AI tools are good at writing code. They are not good at knowing what code to write. That judgement requires a human who understands the business, the users, and what the software needs to survive contact with reality.

If this sounds familiar, start with the audit

We will tell you in writing what you are sitting on: the security issues, the structural problems, the scalability risks. Fixed price, fast turnaround. Credited back if you proceed. It is the fastest way to know what you actually have.

Does your AI-built app need looking at?

The problems in this case study are not unusual. They are the default output of AI coding tools used without engineering oversight.

If you have an AI-built app in production, or in development, it is worth knowing what you are sitting on before your users find out.

  • Fixed-price AI code audit: security, structure, scalability
  • Clear written report on what you have and what it needs
  • Credited back if you proceed with a rescue
  • UK team, direct access to Rob and Jason

We reply within one working day.
