The dominant failure mode in AI-assisted research isn’t fabrication — it’s corruption of real sources. And the disease predates the tool.
A medical research team tested what happens when you ask ChatGPT to generate references. Of 115 citations it produced, 47 percent were entirely fabricated: papers that never existed, attributed to authors who never wrote them. That’s the number that made headlines.1
Here’s the number that didn’t: 46 percent were real papers with wrong metadata. Real journals. Real authors. Plausible-looking DOIs. And the titles, dates, volume numbers, or conclusions were fiction.
Only 7 percent were both real and accurate.
The public narrative frames AI citation hallucination as invention — models conjuring papers from nothing. The data tells a different story. For every fake paper an LLM invents, it corrupts the metadata of a real one. The papers exist. The attribution is fiction.
And this disease is older than any language model.
The metadata corruption problem has a detailed anatomy. Walters and Wijesooriya published the most comprehensive error taxonomy in 2023 in Scientific Reports, examining what goes wrong when GPT cites real papers incorrectly.2
| Metadata Field | Error Rate (GPT-3.5) | Error Rate (GPT-4) |
|---|---|---|
| Volume / issue / page numbers | 34% | 13% |
| Publication year | 22% | 16% |
| Journal title | 14% | 4% |
| Author names | 14% | 6% |
| Publisher / organization | 9% | 3% |
| Work title | 6% | 3% |
The pattern is legible. Numeric metadata — volume numbers, page ranges, publication years — fails most often. These are the fields a model is most likely to reconstruct from partial training data, filling slots with plausible-looking numbers that belong to different papers or to no paper at all. A recurring error: models report online posting dates instead of original publication dates.2 GPT-4 halves most error rates, but the hierarchy stays the same. Numbers remain the weakest link.
The most insidious variant isn’t a wrong page number. It’s a correct DOI that redirects to a real article on a completely different topic from what the citation claims. A working link. A real paper. A verifiable identifier. And the content has nothing to do with the argument it’s supposedly supporting.2 The verification chain looks intact. The claim is unsupported. This is citation corruption at its most dangerous — because it passes every automated check.
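What would catch it is not a smarter link check but a metadata comparison against the registry of record. The sketch below is illustrative only: it assumes the public Crossref REST API (api.crossref.org) and its works schema, and the claimed-citation fields and the similarity threshold are invented for the example, not part of any production verifier.

```python
import requests
from difflib import SequenceMatcher


def doi_resolves(doi: str) -> bool:
    """The check that "passes every automated check": the DOI points at a real,
    registered work. It says nothing about whether that work matches the citation."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200


def citation_matches_registration(claimed: dict) -> list[str]:
    """Compare a claimed citation against the metadata registered for its DOI.

    `claimed` is an illustrative dict: {"doi", "title", "year", "volume"}.
    Returns a list of mismatches; an empty list means the claim survives the check.
    """
    resp = requests.get(f"https://api.crossref.org/works/{claimed['doi']}", timeout=10)
    if resp.status_code != 200:
        return ["DOI does not resolve to a registered work"]
    record = resp.json()["message"]
    problems = []

    # Title: fuzzy match, since casing and punctuation vary across citation styles.
    registered_title = (record.get("title") or [""])[0]
    similarity = SequenceMatcher(
        None, claimed["title"].lower(), registered_title.lower()
    ).ratio()
    if similarity < 0.8:  # illustrative threshold, not a calibrated value
        problems.append(f"title mismatch (similarity {similarity:.2f})")

    # Numeric fields: the ones the error taxonomy above says fail most often.
    date_parts = record.get("issued", {}).get("date-parts") or [[None]]
    registered_year = date_parts[0][0] if date_parts[0] else None
    if claimed.get("year") and registered_year and claimed["year"] != registered_year:
        problems.append(f"year: cited {claimed['year']}, registered {registered_year}")
    if claimed.get("volume") and record.get("volume") and str(claimed["volume"]) != str(record["volume"]):
        problems.append(f"volume: cited {claimed['volume']}, registered {record['volume']}")

    return problems
```

The first function is roughly what most automated pipelines amount to. The second only works because it goes back to the registry rather than to whoever repeated the citation.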
The problem is now measurable at the scale of major academic conferences.
At NeurIPS 2025, at least 53 of roughly 4,000 accepted papers — about 1.3 percent — contained hallucinated citations.3 The SPY Lab, analyzing three computer science conferences, found 2.6 percent of 2025 papers had at least one potentially hallucinated citation, up from 0.3 percent in 2024 — an 8.7-fold increase in a single year. Every 2025 proceeding they examined contained mysterious citations. None of the 2021 proceedings did.4
At ICLR 2026, GPTZero scanned 300 of roughly 20,000 submissions and found 50 confirmed hallucinations, extrapolated to hundreds across the full pool. Some of those papers had already been reviewed by three to five peer experts, most of whom missed the fake citations. Several averaged an 8-out-of-10 rating from reviewers.5
Peer review was never designed to catch this. Reviewers evaluate methodology and argumentation, not whether every cited work actually exists. The gatekeeping mechanism didn’t fail. It was built for a different threat.
Here is the finding that should make researchers most uncomfortable. A 2026 study testing hallucination rates across commercial models and deep research agents found that the purpose-built research tools — the ones designed specifically for citation-heavy work — hallucinated at more than twice the rate of basic search-augmented models. The pooled rate was 10.7 percent for deep research agents versus 4.8 percent for basic models.6 Models that generate more citations per query produce worse citations per citation. Trying harder to cite produces less reliable results.
And the failure is not uniform across knowledge. Within a single model, citation reliability varies by a factor of 4.3 across domains. Business citations are the most reliable. Theology and niche medicine are the least.6 The fields that need the most careful sourcing — rare diseases, specialized medical conditions, underrepresented areas of scholarship — get the worst citation quality.
None of this is new. AI didn’t invent citation corruption. It industrialized a disease that was already endemic in human scholarship.
The most important number in citation science comes from Simkin and Roychowdhury, who published “Read before you cite!” in 2003. Their estimate: only about 20 percent of citers actually read the original papers they cite.7 The remaining 80 percent copy references from other papers’ bibliographies. This was measured before LLMs, before RAG, before deep research agents. The scholarly telephone game was already running at industrial scale.
The canonical case study belongs to the Hawthorne effect. In 2019, Letrud and Hernes examined 613 citations of three articles that critiqued the so-called Hawthorne effect — the long-disputed claim that workers improve performance when they know they’re being observed. Of those 613 citations, 468 — 76.3 percent — mis-cited the original studies as supporting the Hawthorne effect. The exact opposite of what the papers argued.8
A 1978 critique by Richard Franke and James Kaul, with 277 retrievable citations, was mis-cited 189 times as affirming the phenomenon it debunked. Both earlier critiques were initially cited correctly but were increasingly mis-cited over time, as each new incorrect citation became the template for the next.
The myth didn’t survive despite the debunking papers. It survived through them. Each citation of the critique became an inadvertent endorsement of the idea being critiqued. This is affirmative citation bias, and it required no AI at all.
Laurent Bossavit documented the same mechanism in software engineering. In The Leprechauns of Software Engineering, he traced well-known claims — like the exponential cost curve of late bug fixes — back to origins that degrade through successive citations.9 “Subsequent papers tend to drop details or qualifications,” Bossavit found, “using citations to support claims that, over time, diverge more and more from those in the original paper.” He realized mid-lecture that he couldn’t justify claims he’d been repeating for years about established software engineering knowledge.
The pre-AI baseline for citation errors across disciplines runs between 11 and 41 percent, depending on domain and methodology.10 In biomedical literature specifically, error rates of 20 to 26 percent are typical. The infrastructure was contaminated long before the first language model generated a reference.
Google Scholar creates placeholder records — [CITATION] stubs — for references it cannot match to actual documents. These stubs accumulate citations and apparent legitimacy without anyone verifying that the source documents exist.11
The term “citogenesis” describes what happens next. A fabricated or corrupted reference gets cited in a real paper. That paper appears in databases. New researchers — or LLMs — find the citation and conclude the reference must be legitimate. A single fabricated citation attributed to “Williamson and Piattoeva” accumulated 43 citations in Google Scholar, with citing papers from 2019 and 2021 — years before ChatGPT’s widespread adoption. The ghost reference was entirely human-created.11
When LLMs use retrieval-augmented generation to verify citations, they search the web and may find pages citing the ghost reference. Testing revealed that ChatGPT “confidently claimed the paper existed and pointed to what appeared to be a source” — which turned out to be a webpage from another journal that had itself cited the ghost reference.11 The verification system is checking the patient’s chart, not the patient. If the chart is wrong, verification confirms the error with extra confidence.
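The distinction between checking the chart and checking the patient is easy to state in code. This is a sketch under assumptions: it uses Crossref’s public bibliographic search as a stand-in registry, and the substring title match is deliberately crude; a real verifier would also need other registries (preprints, books) and fuzzier matching.

```python
import requests


def registered_anywhere(title: str, author: str) -> bool:
    """Check the patient, not the chart: ask a bibliographic registry (Crossref
    here) whether a work with this title and author is actually registered,
    instead of asking the web whether anyone has repeated the citation."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": f"{title} {author}", "rows": 5},
        timeout=10,
    )
    resp.raise_for_status()
    candidates = resp.json()["message"]["items"]

    # Require a close title match. A ghost reference will surface near-misses
    # (other papers by the same authors) but no record of the cited work itself.
    claimed = title.lower()
    return any(claimed in (c.get("title") or [""])[0].lower() for c in candidates)
```

A registry lookup has blind spots of its own, but it consults the record of what was published rather than the echo of what has been cited, which is exactly the loop a ghost reference exploits.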
The legal system provides the cleanest data on what happens when corrupted citations meet high stakes. As of April 2026, courts worldwide have documented 1,227 submissions containing AI-fabricated or AI-corrupted citations. The breakdown matters: 1,022 involved fully fabricated cases, 323 involved false quotes from real cases, and 492 involved misrepresented holdings of real cases. About 40 percent of legal citation problems involve real sources with wrong metadata, the same pattern the medical and academic data reveals.12
The most visible case remains Mata v. Avianca, where an attorney submitted ChatGPT-generated case citations that proved fictitious. When confronted, he asked ChatGPT to verify its own citations. ChatGPT assured him that the cases “indeed exist” and “can be found in reputable legal databases such as LexisNexis and Westlaw.” The model didn’t just fabricate. It hallucinated the verification. Fifty-nine percent of AI citation incidents in the legal system involve pro se litigants: the people least equipped to verify citations are the most dependent on AI to generate them.12
Three limits keep the human-AI parallel from collapsing into false equivalence.
First, the mechanisms differ even when the outcomes converge. Human citation errors accumulate through laziness and social trust — someone copies a reference from another paper’s bibliography without checking it. LLM errors accumulate through statistical reconstruction from partial training data. The citer who doesn’t read the original makes a social choice. The model that corrupts a page number makes a probabilistic computation. The error looks the same in the bibliography. The intervention required to prevent it is entirely different.
Second, human errors are individually authored and traceable. Each mis-citation of the Hawthorne papers can be attributed to a specific citer’s choice. LLM metadata errors are systemic — the same model produces the same corruption identically across millions of users. One human citer corrupts one reference at a time. One model corrupts the same reference everywhere, simultaneously.
Third, the correction paths diverge. Human citation errors respond, slowly, to education, norms, and social pressure. LLM citation errors require architectural changes — retrieval verification layers, metadata validation, hallucination detection pipelines — that are engineering problems, not cultural ones. Framing both as “the same disease” is useful for diagnosis. It is misleading for treatment.
Both human and AI citation failures share a root cause: metadata is treated as a derivative of the source rather than verified independently. Humans assume other humans checked the reference they’re copying. LLMs reconstruct metadata probabilistically from training data that includes those same unchecked references. Both produce plausible-looking citations where the source is real but the attribution is fiction.
The difference is scale and speed. The Hawthorne effect took 40 years to accumulate 468 mis-citations. An LLM can corrupt a metadata field in milliseconds and distribute it to millions. What was endemic in human scholarship became epidemic the moment we trained models on the bibliography.
We didn’t build an AI that hallucinates citations. We built an AI that learned our citation habits — and the 80 percent of us who never read the original have no standing to be surprised.
The fields that need the most careful sourcing get the worst quality. The tools designed for deeper research produce more errors per citation. The mechanism for correcting the record can amplify the error. And the human baseline being bad doesn’t excuse deploying systems that make it worse at scale.
Both problems need addressing. But the first step is the same one it has always been: read before you cite.
Read before you cite. Verify before you trust.
The essay’s core finding: metadata is treated as derivative rather than independently verified — by humans who copy references they haven’t read and by models that reconstruct citations from partial training data. Both produce plausible-looking attribution where the source is real but the claim is unsupported. Chain of Consciousness applies the same lesson to AI agents: every action anchored to a verifiable external record, every claim traceable to its actual source, every link in the provenance chain independently checkable. Not a self-report. Not a log file the agent controls. A chain where anyone can verify whether the cited source actually supports the cited conclusion.
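What “independently checkable” means mechanically can be sketched with a toy hash chain. The field names and functions below are illustrative assumptions for this essay, not the schema or API of the chain-of-consciousness packages.

```python
import hashlib
import json


def add_entry(chain: list[dict], source_doi: str, claim: str) -> list[dict]:
    """Append one provenance link: the claim, the resolvable source it rests on,
    and a hash binding it to the previous entry. Field names are illustrative."""
    prev_hash = chain[-1]["hash"] if chain else "genesis"
    body = {"prev": prev_hash, "source_doi": source_doi, "claim": claim}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return chain + [{**body, "hash": digest}]


def chain_is_intact(chain: list[dict]) -> bool:
    """Anyone holding the chain can recompute every hash: an altered claim,
    a swapped source, or a reordered entry breaks the chain visibly."""
    prev_hash = "genesis"
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        recomputed = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if body["prev"] != prev_hash or recomputed != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True
```

The toy version shows only the checkability. On its own it is still a log the writer controls; anchoring the latest hash to an external record the agent cannot rewrite is what turns it into the kind of provenance described above.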
pip install chain-of-consciousness · npm install chain-of-consciousness
Hosted Chain of Consciousness → · See a verified provenance chain