
Vibe Coding Is Not the Problem. Deploying Without Understanding Is.


The Red Access findings exposed roughly 5,000 leaky apps. The real story is not about AI writing bad code. It is about what happens when deployment becomes frictionless and responsibility disappears.


The internet had a minor panic recently. Israeli cybersecurity firm Red Access published findings that sent tech journalists scrambling: roughly 5,000 AI-generated web applications were found to be leaking sensitive data, including patient medical conversations, school lesson recordings, and internal staff schedules. The story spread fast, and the culprit was named quickly: vibe coding.

It is a satisfying villain. The term, coined by AI researcher Andrej Karpathy in early 2025, describes the practice of generating and deploying software with AI tools while bringing little to no traditional coding experience to the process. Type a prompt, click deploy, watch your app go live. The mental image practically writes the headlines itself.

[Image: screenshot of Andrej Karpathy's post on X (formerly Twitter), February 2025]

But here is what those headlines consistently miss: the problem was never that AI wrote the code. The problem is that an entire layer of architectural responsibility was abstracted away, and nobody noticed it was gone.

What Red Access Actually Found

To understand the findings clearly, it helps to look at the actual numbers. Red Access scanned approximately 380,000 publicly accessible assets created using popular AI-powered coding tools such as Lovable, Replit, Netlify, and Base44. Of those, about 5,000 apps exposed potentially sensitive information.

Red Access CEO Dor Zvi described the findings as part of research into "shadow AI," the growing trend of employees using AI tools without formal approval from their organizations. The exposed data ranged from patient conversations to internal business documents. Zvi called it "one of the biggest events ever where people are exposing corporate or other sensitive information to anyone in the world."

The companies behind the implicated tools pushed back. Replit CEO Amjad Masad noted that users choose whether their apps are public or private, and that the existence of a publicly visible app does not automatically constitute a security breach. That is a fair technical point, but it largely confirms the core issue rather than dismissing it: the users did not know that making something public meant exposing everything inside it. The configuration options existed. The understanding of what those options meant did not.

That gap, between a tool existing and a user understanding what it does, is where nearly every modern security disaster lives.

The Barrier That Dropped

Before the AI-assisted development surge, shipping a broken application to production required a meaningful amount of friction. A developer had to know how to configure a server, understand how databases handle access, write enough code to make the application function, and debug the inevitable errors that appeared during that process. None of those tasks guaranteed security, but they created natural checkpoints. You could not accidentally deploy a misconfigured database if you did not first understand enough about databases to set one up.

AI tools did not invent misconfigured databases. They did not introduce insecure APIs, exposed secrets in client-side code, or privilege escalation vulnerabilities. They automated the process of reaching production faster than any of those issues could be caught.

The numbers bear this out. Research from Apiiro found that repositories with significant AI-generated code showed a 322% increase in privilege escalation flaws and a 153% rise in architectural design flaws. These are not typo-level bugs that a syntax checker catches. These are structural decisions about who should be able to access what, baked into the application from the moment it is prompted into existence.

Meanwhile, a Veracode analysis covering over 100 large language models across 80 distinct coding tasks found that approximately 45% of AI-generated code introduces security vulnerabilities. The overall security pass rate remained flat at around 55% in Veracode's March 2026 update, unchanged despite continued improvements in raw coding performance benchmarks. The AI is getting better at writing code. It is not getting better at securing it, because security is not what most prompts ask for.

Why the AI Writes Insecure Code

This is the part of the conversation that tends to get lost in the noise about vibe coding being reckless or AI being dangerous. The insecurity is not a malfunction. It is a mirror.

Large language models learn to code by training on vast repositories of existing code. That existing code, accumulated over decades across GitHub, Stack Overflow, documentation, and open source projects, is full of insecure patterns. Developers have been writing SQL queries using string concatenation instead of parameterized inputs for thirty years. Hardcoded credentials have been appearing in version-controlled repositories since version control existed. If a meaningful percentage of the training data demonstrates an insecure pattern, the model learns that pattern as a valid way to solve that problem.
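To make that concrete, here is a minimal illustrative sketch of the SQL injection pattern in Python; the table and column names are hypothetical. The first function mirrors the string-concatenation style that shows up constantly in training data, and the second is the parameterized form that keeps user input from rewriting the query.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # The pattern models see everywhere in older code: the query string is
    # built by concatenation, so input like  ' OR '1'='1  becomes part of
    # the query itself instead of being treated as data.
    query = "SELECT id, email FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized version: the database driver binds the value separately,
    # so the input can never change the structure of the query.
    return conn.execute(
        "SELECT id, email FROM users WHERE username = ?", (username,)
    ).fetchall()
```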

Georgia Tech's Systems Software and Security Lab launched the Vibe Security Radar project in 2025 to track exactly how many publicly filed CVEs can be traced to AI-generated code. Their early results were stark: 86% of AI-generated samples failed to defend against cross-site scripting, and 88% were vulnerable to log injection. Both of those vulnerability classes sit firmly on the OWASP Top 10 list of the most dangerous web application security risks. These are not edge cases. They are foundational web security concepts that have been understood for over two decades.
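To make the cross-site scripting failure mode concrete, here is a deliberately minimal sketch; the rendering helpers are hypothetical, and real applications should lean on a templating engine that escapes by default. The unsafe version reflects user input straight into markup, while the safe version escapes it first.

```python
import html

def render_comment_unsafe(comment: str) -> str:
    # Reflects user input directly into the page: a comment containing a
    # <script> tag will execute in every visitor's browser.
    return "<div class='comment'>" + comment + "</div>"

def render_comment_safe(comment: str) -> str:
    # Escapes the characters that would otherwise be interpreted as markup,
    # so the comment is displayed as text rather than executed.
    return "<div class='comment'>" + html.escape(comment) + "</div>"
```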

The AI is not hallucinating new categories of insecurity. It is faithfully reproducing the insecurity that already existed in the code it was trained on, at scale, and at a speed that makes manual review harder to maintain.

The "Vibe Coding" Scapegoat

There is a subtle gatekeeping dynamic embedded in how these incidents get reported, and it is worth naming.

When a major enterprise suffers a data breach, the coverage tends to describe it as a sophisticated cyberattack, a state-sponsored intrusion, or a complex supply chain compromise. When a solo developer's app leaks data after being built entirely with AI assistance, the story becomes about vibe coding culture, the recklessness of non-technical people shipping products, and the dangers of democratized software development.

In reality, the breach at a Fortune 500 company and the leak from a vibe-coded startup are often caused by the exact same underlying failures: misconfigured access controls, exposed API keys, absent row-level security policies, or a database that should have been private sitting in a publicly accessible environment. The difference is the post-incident framing, not the vulnerability class.

OWASP recognized this in 2025 by adding a dedicated category to its Top 10 specifically calling out vibe coding as a risk pattern. That recognition is useful. But the same OWASP Top 10 has listed injection attacks, broken access control, and security misconfiguration as dominant risk categories for years, all of which appear regularly in codebases written by credentialed senior engineers with no AI involvement whatsoever.

According to the Identity Theft Resource Center's 2025 Annual Data Breach Report, the United States recorded 3,332 data compromises in 2025, a new all-time record and a 79% increase over five years. That trajectory was established long before vibe coding existed as a term.

The Two Types of AI-Assisted Developers

The framing that "AI builds insecure apps" collapses a meaningful distinction, one the actual security risk profile depends on entirely.

The first type of AI-assisted developer is someone with an existing security intuition who uses AI tools as a high-speed pair programmer. They use AI to generate boilerplate code, scaffold components, and move quickly through problems they already understand. But they are reading the diff. They are thinking about the threat model. They understand what trust boundaries mean and why database access should be scoped to the minimum necessary level. For this developer, AI is a productivity multiplier, and the roughly 3.6 hours per week of saved developer time gets redirected into review, architecture, and decisions that require contextual judgment the model cannot provide.

The second type is what researchers have started calling "pure vibe coding": a user with zero security background who describes a product to an AI, receives a working application, and clicks deploy without reading a line of what was generated. This user does not know to ask whether the database is publicly accessible. They do not know what row-level security means. They cannot evaluate the access control logic because they do not know that access control logic needs to exist.

The tool itself is identical in both scenarios. The risk is entirely determined by the pilot.

A security scan in May 2025 of 1,645 web apps built on the Lovable platform found that roughly 10% had flaws exposing personal user data, specifically because creators relied on AI output without any review. That same year, 25% of startups in Y Combinator's Winter 2025 cohort reported codebases that were 95% AI-generated. Security researchers scanning close to 5,600 vibe-coded applications discovered over 2,000 vulnerabilities and more than 400 exposed secrets.

AI-written code is not inherently bad. Deploying it unverified is.

The Same AI Is Also Finding the Vulnerabilities

There is an irony worth sitting with here. The same class of AI models being blamed for creating insecure applications is being actively deployed by security researchers to discover zero-day vulnerabilities in enterprise systems written entirely by human engineers, sometimes years or decades ago.

Security teams at major firms are using large language models to analyze codebases for vulnerability patterns at a scale and speed that human reviewers cannot match. The AI does not discriminate between human-authored insecurity and AI-authored insecurity. It looks for patterns, and patterns are patterns regardless of who wrote the original code.

This cuts both ways. AI generating insecure code is a concern. AI finding insecure code at scale is an asset. The organizations that understand both dimensions are already building processes that use AI for security scanning precisely because of the volume of AI-generated code now entering their pipelines.

Where the Industry Actually Stands

The scale of AI adoption in software development is now substantial enough that the question has shifted from whether AI will be part of production codebases to how the resulting risk will be governed.

According to current data, approximately 41% of all code written globally is AI-generated or AI-assisted, a figure that was just 6% in 2023. The AI coding tools market reached $7.37 billion in 2025 and is projected to hit $30.1 billion by 2032. GitHub Copilot alone has over 20 million cumulative users and is deployed across 90% of Fortune 100 companies. The Sonar State of Code Developer Survey found that developers report approximately 42% of their code is currently AI-generated or assisted, and they predict that share will increase by more than half by 2027.

That trajectory creates a straightforward implication: the differentiator going forward is not who uses AI. Nearly everyone does. The differentiator is who understands the security architecture of what the AI produced.

The tools needed to close this gap exist. Static application security testing, or SAST, can scan AI-generated code before it reaches a pull request. Software composition analysis tools check dependencies for known vulnerabilities, which matters because AI models have training cutoffs and may recommend library versions that have since been flagged. Manual review of authentication flows, encryption implementations, and payment processing logic remains essential in any context where the AI cannot be trusted to understand regulatory requirements.
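What a gate like that might look like in practice varies by stack, but here is a minimal sketch for a Python codebase. It assumes the open-source tools bandit (a SAST scanner for Python) and pip-audit (a dependency auditor) are installed, and that first-party code lives in a src directory; both tools exit non-zero when they find problems, so the script simply propagates that status to CI.

```python
"""A minimal pre-merge security gate sketch for a Python project."""
import subprocess
import sys

CHECKS = [
    ["bandit", "-r", "src"],   # static analysis of first-party code
    ["pip-audit"],             # known vulnerabilities in dependencies
]

def main() -> int:
    failed = False
    for cmd in CHECKS:
        print(f"Running: {' '.join(cmd)}")
        if subprocess.run(cmd).returncode != 0:
            failed = True
    # A non-zero exit fails the CI job, blocking the merge until reviewed.
    return 1 if failed else 0

if __name__ == "__main__":
    sys.exit(main())
```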

OWASP's guidance recommends prohibiting AI-generated code in high-risk areas, specifically authentication, encryption, and payment processing, without mandatory human review. That is not anti-AI. It is the same principle that requires a licensed architect to sign off on structural calculations even when the initial design was drafted by a junior engineer.

The Real Abstraction Problem

The deeper issue the Red Access findings surfaced is not that AI generates bad code. It is that AI tools have abstracted the deployment process in a way that also abstracted away the accountability that used to come with it.

When building a web application required setting up a server, configuring a database, writing authentication logic, and managing credentials, the difficulty of doing those things badly was itself a forcing function. Developers who got those things wrong saw the consequences quickly, learned from them, and developed instincts over time. The friction was not purely overhead. Some of it was formative.

No-code and low-code tools began eroding that friction years before large language models arrived. AI-assisted vibe coding accelerated the erosion dramatically. What results is a population of deployed applications built by people who have never had to think about whether a database connection string should be in client-side code, because they did not write the connection string, do not know it exists, and were never prompted to consider what happens if someone outside their organization finds it.

That is not a story about evil AI. It is a story about what happens when the speed of deployment runs ahead of the culture of accountability around what gets deployed.

The solution is not to slow deployment back down artificially. It is to build the accountability practices that frictionless deployment requires: automated security scanning baked into deployment pipelines, mandatory review checklists for AI-generated code that touches user data, and a shared understanding across teams that "it works" and "it is safe" are two different criteria that need to be evaluated separately.

What Responsible AI-Assisted Development Looks Like

Practically speaking, organizations and developers building with AI tools in 2026 need to treat security as an explicit requirement in every prompt, not an assumed default. Asking an AI to "build a user authentication system" will produce a working authentication system. Asking it to "build a user authentication system with secure password hashing, rate limiting on login attempts, and no exposure of session tokens in client-side storage" produces a materially different output.
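As a deliberately simplified sketch of what that second prompt is asking for, the fragment below hashes passwords with a salted PBKDF2 derivation, applies a fixed-window rate limit to login attempts, and compares hashes in constant time. The in-memory dictionaries and specific parameters are illustrative placeholders, not a production design; most teams would reach for a vetted library instead.

```python
import hashlib
import hmac
import os
import time

_users: dict[str, tuple[bytes, bytes]] = {}    # username -> (salt, password hash)
_attempts: dict[str, list[float]] = {}         # username -> recent login timestamps
MAX_ATTEMPTS, WINDOW_SECONDS = 5, 300          # illustrative rate-limit parameters

def register(username: str, password: str) -> None:
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    _users[username] = (salt, digest)          # never store the raw password

def login(username: str, password: str) -> bool:
    now = time.monotonic()
    recent = [t for t in _attempts.get(username, []) if now - t < WINDOW_SECONDS]
    if len(recent) >= MAX_ATTEMPTS:
        return False                           # too many attempts in the window
    _attempts[username] = recent + [now]
    if username not in _users:
        return False
    salt, stored = _users[username]
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(candidate, stored)   # constant-time comparison
```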

Review every piece of AI-generated code that touches authentication, data storage, third-party API calls, or user-facing inputs before deployment. Use free tools like OWASP ZAP or integrated scanning in your deployment pipeline. For any application handling personal data, medical records, financial information, or regulated data, professional security review is not optional.
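For teams without an established pipeline, a baseline dynamic scan can be wired in with a few lines. The sketch below assumes Docker is available and that a staging deployment is reachable at a placeholder URL; it runs ZAP's documented baseline scan and passes the exit status back to CI.

```python
"""Sketch of invoking an OWASP ZAP baseline scan from a deploy script."""
import subprocess
import sys

TARGET_URL = "https://staging.example.com"   # placeholder staging deployment

result = subprocess.run([
    "docker", "run", "--rm", "-t",
    "ghcr.io/zaproxy/zaproxy:stable",        # official ZAP image at time of writing
    "zap-baseline.py", "-t", TARGET_URL,     # passive baseline scan of the target
])
# Propagate ZAP's exit status so warnings or failures surface in CI.
sys.exit(result.returncode)
```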

The Red Access findings are a useful reminder of what the absence of that review looks like at scale. Five thousand applications, many of them built by people who genuinely did not know their users' data was accessible to anyone on the internet. Not malice, not negligence in the conventional sense, but a system where the ability to deploy arrived before the knowledge of what responsible deployment requires.

That imbalance is correctable. The tools, the knowledge, and the awareness are all available. The question now is whether the culture of building with AI catches up to the culture of deploying with it.
