AI-generated code contains 1.7x more bugs, and more severe defects, than human-authored code


A CodeRabbit study analyzing 470 open source pull requests found that AI-generated code contains significantly more defects than human-written code. AI-authored PRs averaged 10.83 issues compared to 6.45 in human PRs, with 1.4x more critical issues and a higher rate of security vulnerabilities, including XSS and password-handling flaws. The findings raise questions about code quality at a time when companies like Microsoft report that 30% of their code now comes from AI tools.

AI-Authored Code Contains Worse Bugs Across Multiple Categories

AI-generated code introduces substantially more problems than human-written alternatives, according to a comprehensive CodeRabbit study examining 470 open source pull requests [1]. The research reveals that AI-generated pull requests contain an average of 10.83 issues each, compared to just 6.45 issues in human-authored PRs, roughly 1.7x as many issues when AI tools are involved [1][2]. This disparity translates into longer code reviews and a greater risk of defects reaching production systems.

Source: The Register

The severity of problems also escalates with machine-generated contributions. AI-authored code contains 1.4x more critical issues and 1.7x more major issues on average than human-written PRs, the report indicates [2]. These findings align with separate research from Cortex, which documented that pull requests per author increased 20 percent year-over-year while incidents per pull request jumped 23.5 percent and change failure rates climbed around 30 percent [1].

Security Vulnerabilities in AI Code Present Heightened Risks

The CodeRabbit study uncovered particularly troubling patterns in the security vulnerabilities introduced by AI-generated code. Machine-authored contributions were 2.74x more likely to add XSS vulnerabilities, 1.91x more likely to create insecure object references, 1.88x more likely to introduce improper password handling, and 1.82x more likely to implement insecure deserialization compared to human developers [1][2]. Overall, AI code showed 1.57x more security findings than human-authored code [2].
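
To make the largest category concrete, here is a hypothetical sketch of a reflected XSS flaw in a Flask handler (an illustration of the vulnerability class, not code from the study), alongside the safe alternative:

```python
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)

# Vulnerable: user-controlled input is interpolated into HTML unescaped,
# so /greet-unsafe?name=<script>alert(1)</script> runs in the visitor's browser.
@app.route("/greet-unsafe")
def greet_unsafe():
    name = request.args.get("name", "")
    return f"<h1>Hello, {name}!</h1>"

# Fixed: escape untrusted input before embedding it in markup
# (template engines with auto-escaping achieve the same effect).
@app.route("/greet")
def greet():
    name = request.args.get("name", "")
    return f"<h1>Hello, {escape(name)}!</h1>"
```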

These security concerns arrive at a critical moment for the industry. Microsoft patched 1,139 CVEs in 2025, the second-largest year for CVEs by volume after 2020, according to Trend Micro researcher Dustin Childs [1]. Microsoft CEO Satya Nadella has stated that as much as 30 percent of the company's code is now written by artificial intelligence [2]. Childs speculates that "as Microsoft's portfolio continues to increase and as AI bugs become more prevalent, this number is likely to go higher in 2026" [1].

Code Quality and Maintainability Suffer Under AI Generation

Beyond security, AI-generated code falls short across major issue categories that affect long-term code quality. The bots created 1.75x more logic errors and correctness issues, 1.64x more code quality and maintainability errors, and 1.42x more performance issues than human-authored code [1][2]. These maintainability problems suggest that while AI tools may accelerate initial development, they create technical debt that requires significant human intervention to address.
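
As a hypothetical illustration of the performance category (not an example drawn from the study), one defect reviewers commonly flag is a linear scan inside a loop where a hash-based lookup would do:

```python
# O(len(a) * len(b)): "x in b" scans the whole list on every iteration.
def common_slow(a: list[int], b: list[int]) -> list[int]:
    return [x for x in a if x in b]

# O(len(a) + len(b)): building a set makes each membership test O(1) on average.
def common_fast(a: list[int], b: list[int]) -> list[int]:
    b_set = set(b)
    return [x for x in a if x in b_set]
```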

David Loker, director of AI at CodeRabbit, noted that "AI coding tools dramatically increase output, but they also introduce predictable, measurable weaknesses that organizations must actively mitigate" [1]. For engineering teams, those weaknesses translate into heavier code review and quality assurance workloads.

Mixed Research Findings and AI Impact on Developer Productivity

Not all studies paint the same picture. A University of Naples paper titled "Human-Written vs. AI-Generated Code: A Large-Scale Study of Defects, Vulnerabilities, and Complexity" found that AI-generated Python and Java code "is generally simpler and more repetitive, yet more prone to unused constructs and hardcoded debugging, while human-written code exhibits greater structural complexity and a higher concentration of maintainability issues" [1]. Researchers from Monash University and the University of Otago observed that GPT-4 frequently produces more complex code requiring reworking for maintainability, though it achieved higher test-case pass rates across various tasks [1].

AI's impact on developer productivity remains contested. Model Evaluation & Threat Research (METR) reported in July that "AI tooling slowed developers down" [1]. CodeRabbit acknowledges methodological limitations in its study, including uncertainty about whether PRs labeled as human-authored were exclusively created by humans [1]. AI did outperform in limited areas: spelling errors were 1.76x more common in human PRs, and human-authored code had 1.32x more testability issues than AI-generated alternatives [1][2]. As organizations integrate AI coding assistants at scale, teams must balance velocity gains against the demonstrable increase in defect volume and severity, which demands stronger review processes.
