How security researchers find what others miss — and how the same apparatus makes them miss what others would find.
In 2013, Trafton Drew’s team ran an experiment that should have been impossible. They took 24 expert radiologists, gave them a stack of lung CT scans to inspect for cancerous nodules, and pasted a small image of a gorilla — about the size of a matchbook, roughly 48 times the size of an average lung nodule — into the upper-right corner of one of the slides. The gorilla was unambiguous. It was clearly a gorilla.
Eighty-three percent of the radiologists missed it. Eye-tracking equipment confirmed that the majority looked directly at the spot where the gorilla was rendered. They saw it. They didn’t notice it. (Drew et al., “The invisible gorilla strikes again,” Psychological Science 24(9), 2013.)
This is what cognitive psychologists call inattentional blindness, and it has a security analogue with a price tag. In a May 2025 preprint, Christensen and colleagues ran a controlled capture-the-flag study to test whether the same mechanism shows up when humans hunt for vulnerabilities. It does. Among CTF participants, 35.3% exhibited what radiologists call Satisfaction of Search — they found one bug and stopped looking. Those participants discovered 25% fewer flags than the control group. The effect was statistically significant (t = -2.413, p = 0.0291). And here's the part that makes it stick: the participants who stopped early reported feeling more proud and less nervous than the participants who kept looking (p = 0.034 and p = 0.010, respectively). The bias didn't feel like a bias. It felt like success. (arXiv:2505.12018v1; n=17, preprint not yet peer-reviewed.)
The sample is small. But the convergence with three decades of radiology research is hard to argue with, and the implication is uncomfortable: the cognitive mechanism that makes a senior auditor faster than a junior one is the same mechanism that makes the senior auditor miss the second SQL injection after finding the first.
Cognitive science has been reasonably clear on this question since 1973. That year, Chase and Simon published their study of chess expertise in Cognitive Psychology, and the result reset the conversation. Chess masters don’t calculate further ahead than novices. They don’t have higher general intelligence. They don’t process the board faster in any raw sense. What they have is a library — somewhere between 50,000 and 100,000 patterns, called chunks, stored in long-term memory and triggered by the configuration of pieces on the board. Show a master a real game position for five seconds and they can reconstruct it almost perfectly. Show them random pieces and the advantage collapses to novice levels. The difference is pattern recognition, not raw cognitive horsepower.
This is what an expert security researcher is, too. Reentrancy in Solidity, off-by-one in C, race conditions in distributed locks, IDOR in REST APIs, prompt injection in agent loops — each is a chunk, and the expert has thousands of them. A novice traces execution paths line by line; an expert sees the vulnerability shape in code structure before conscious reasoning starts. Eye-tracking studies of expertise point the same way: novices show high gaze entropy, scattering attention across surface code, while experts fixate efficiently on structurally significant patterns with much wider effective visual spans (Reingold & Sheridan 2011 in chess; Votipka et al. in vulnerability research).
The chunk library is a compounding asset, and its scarcity is visible in the data. Facebook's Vulnerability Reward Program received 13,233 submissions in 2015; only 526 (3.98%) were valid. Bugcrowd's cumulative numbers from 2013 to 2016 — 54,114 submissions, 9,963 valid — landed at an 18.4% acceptance rate. As one platform staffer summarized it: public bug bounty programs are 95% noise and 5% signal. The noise is every novice running scanner output through the submission form. The signal is the much smaller population that has actually built the chunks.
This is the part of the story that is mostly good news. We know how expertise works, we know it is acquired through deliberate practice with feedback, and we know how to recognize it in the wild.
The bad news is in the same study.
Every cognitive strength has a shadow, and in expert pattern recognition the shadow is precisely shaped by the chunks. The expert sees the pattern they have seen before. They see it fast, accurately, often pre-consciously. What they have a much harder time seeing is the pattern they have not seen before — especially when the not-yet-seen pattern is sitting next to a familiar one.
This is the exact mechanism behind the Drew gorilla result, and it is what the Christensen study is measuring. In smart contract auditing it surfaces as the design-versus-implementation gap: traditional audits catch implementation errors — the things their pattern library is built for — while design-level vulnerabilities, where the system’s intended logic creates the exploitable condition, get missed at much higher rates. “Audited then exploited” is not an outlier; it is a structural feature of the production process.
Cross-domain bias rates make the same point in a different register. A 2025 factor analysis published in MDPI Systems (13(4), 280) quantified susceptibility to two named biases in social engineering contexts: automation bias around 47% and confirmation bias around 37%. The authors note these rates “reflect well-documented patterns from studies in healthcare, finance, and aviation.” Same biases. Same approximate rates. Across radiology, financial audit, aviation incident investigation, and cyber. The shared cognitive architecture produces shared failure modes regardless of the domain it is deployed against.
The most provocative claim in this literature comes from a 2022 ASIS International review, which reports that security training and experience show no significant effect on a person's vulnerability to cognitive bias. The source is industry — not peer-reviewed — and should be read with that hedge attached. But the claim is consistent with the radiology data (where decades of experience do not protect against the gorilla) and with the Christensen result (where Satisfaction of Search showed up among CTF-experienced students, not bystanders). At minimum, the cumulative evidence is that bias-resistance is not a bonus skill that comes with seniority. The pattern library grows; the architecture that contains it does not change.
This is uncomfortable for a profession that sells itself on training. It is also true.
If cognitive biases reliably degrade attacker performance the way they degrade defender performance — and the literature says they do — then the implication is straightforward: defenders should engineer environments that exploit attacker biases on purpose. This is the thesis behind ReSCIND (Reimagining Security with Cyberpsychology-Informed Network Defenses), an IARPA program launched in February 2024. Charles River Analytics’ arm of the program reportedly carries roughly $6.1 million in funding, with partners including Arizona State, Montana State, Assured Information Security, Narf Industries, and SimSpace.
The targeted biases will look familiar: sunk cost (encouraging attackers to keep grinding unproductive paths), anchoring (decoy targets that pull attention from real ones), confirmation bias (system responses that reinforce a wrong mental model), availability (flooding the attacker with misleading evidence), and attentional tunneling. CRA’s published results from five empirical studies — expert attackers on realistic networks, MITRE ATT&CK-aligned — reported two effects as most reliably exploitable. Loss aversion: threatening initial gains causes attackers to defend what they have rather than pursue new objectives. The representativeness heuristic: network devices configured to look outdated and weak successfully drew attackers toward the decoys and away from the valuable assets.
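For concreteness, here is what a decoy profile aimed at the representativeness heuristic might look like in configuration form. This is a hypothetical sketch in Python; ReSCIND's actual tooling is not public, and every field name below is illustrative:

```python
# Hypothetical decoy profile targeting the representativeness heuristic:
# the host is dressed to match the attacker's chunk for "outdated, weak,
# easy win" while everything it serves is instrumented. Illustrative only;
# none of this reflects ReSCIND's actual (non-public) tooling.
DECOY_PROFILE = {
    "hostname": "backup-srv-02",             # mundane name, suggests neglect
    "ssh_banner": "OpenSSH_5.3",             # old version string on a patched service
    "exposed_ports": [21, 22, 8080],         # superficially sloppy attack surface
    "filesystem_bait": ["/opt/legacy/", "passwords.xls.bak"],
    "telemetry": "full",                     # every touch is logged and attributed
}
```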
Note the inversion. For thirty years the standard model of cognitive biases in security has been defenders are biased and attackers exploit them — phishing, social engineering, the whole pretexting toolkit. ReSCIND is the same idea pointed in the other direction: attackers are biased, environments can be engineered to exploit them, and the defender’s edge is partly cognitive rather than purely technical.
The interesting wrinkle is that the same Christensen CTF study that found Satisfaction of Search bias also tested for loss aversion in attackers and found nothing. Zero participants quit due to fear of losing rewards. Across the sample, intellectual challenge and curiosity dominated; loss aversion did not register. Bugcrowd’s surveys say much the same in a different way: 85-87% of researchers prioritize reporting critical vulnerabilities over financial reward. The intrinsic motivator beats the extrinsic one for this population, repeatedly, across methodologies.
So loss aversion works on attackers in realistic networks (per CRA’s results) but doesn’t work on attackers in CTFs (per Christensen). This is not a contradiction in the data. It is a contradiction in the environment. CTF participants have nothing real to lose — compensation is guaranteed regardless — so there is no sunk-cost position to defend. Realistic network engagements create genuine sunk-cost positions: time spent, footholds gained, persistence achieved. The bias activates when the structure of the environment supplies it with something to grab.
This is exactly the Kahneman-Klein framework for trustworthy intuition (their 2009 American Psychologist paper). Reliable expertise — and reliable bias exploitation — both require regular environments and accurate, timely feedback. Change the environmental structure and the cognitive dynamics change. Cognitive defense is not a fixed discipline; it is environment design. The defender’s primary lever is not training the user. It is shaping the surface.
Where does this argument run thin? Three places, named honestly.
First, sample sizes. The Christensen CTF study is n=17. The original “security mindset” qualitative work (Oesch et al., Journal of Cybersecurity, 2023) is n=21. Effect sizes are large and converge across studies, but the quantitative ground is thinner than the strength of the framing would suggest. Every load-bearing number in this essay comes with a footnote-sized caveat that further replication is needed. The pattern is real. The precise magnitudes deserve another decimal point’s worth of skepticism.
Second, ecological validity. Lab studies and CTF competitions are not real penetration tests. The Tularosa study (Ferguson-Walter et al., USENIX Security 2021) used professional red teamers in realistic network scenarios and still found confirmation bias and framing effects, which closes part of this gap. It does not close all of it. We do not yet have a Drew-quality study on real auditors auditing real code under real time pressure. We have converging proxy evidence.
Third, WEIRD bias. The cognitive science literature here is overwhelmingly Western, university-educated, and English-speaking. Hacker demographics are global — HackerOne reports significant populations in India, Bangladesh, Pakistan, Egypt, Nigeria, Vietnam — but the psychological research is not. The adversarial mindset may develop differently in cultures with different relationships to authority, system trust, and individualism. We do not know yet.
None of these caveats invalidates the central thesis. They do constrain how much weight any single number can carry.
If the same cognitive mechanism creates expertise and creates blindness, the way out is not training auditors harder. The training already happened; that is what produced the chunk library, and the chunk library is what produces the blind spots. The way out is structural. Three concrete moves, ordered by how cheap they are to start.
Cheapest: scope your audits in two passes by different people. The first pass uses an experienced researcher with a relevant chunk library and lets them work in their natural mode. The second pass uses a researcher with a different chunk library — different domain background, different vulnerability specialty, ideally a different primary language from the first reviewer's. The Satisfaction-of-Search effect is per-person; the second observer has not had their search satisfied by the first one's findings. Most engagements run as a single review with one team handling all phases. That is ergonomic and cheap. It is also precisely the failure mode the cognitive science predicts.
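What the pairing step could look like in tooling terms: a minimal sketch, assuming a hypothetical reviewer registry. `Reviewer`, `chunk_overlap`, and `pick_second_pass` are illustrative names, not an existing tool:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Reviewer:
    name: str
    specialties: frozenset[str]  # vulnerability classes this reviewer's chunk library covers
    primary_language: str        # primary language of their background

def chunk_overlap(a: Reviewer, b: Reviewer) -> float:
    """Jaccard overlap of two reviewers' specialty sets (0 = fully disjoint)."""
    union = a.specialties | b.specialties
    return len(a.specialties & b.specialties) / len(union) if union else 1.0

def pick_second_pass(first: Reviewer, pool: list[Reviewer]) -> Reviewer:
    """Pick the second-pass reviewer whose chunk library least resembles the
    first reviewer's, breaking ties in favor of a different primary language."""
    return min(
        (r for r in pool if r.name != first.name),
        key=lambda r: (chunk_overlap(first, r),
                       r.primary_language == first.primary_language),
    )

first = Reviewer("A", frozenset({"reentrancy", "access-control"}), "solidity")
pool = [
    Reviewer("B", frozenset({"reentrancy", "oracle-manipulation"}), "solidity"),
    Reviewer("C", frozenset({"race-conditions", "idor"}), "go"),
]
print(pick_second_pass(first, pool).name)  # -> C: disjoint specialties, different language
```

The point is not the scoring function. It is that reviewer disjointness becomes an explicit, checkable property of the engagement rather than an accident of staffing.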
Medium cost: make the bias-prone moments visible to the auditor in the workflow. Add an explicit checklist step after a vulnerability is found that says, in plain language: the cognitive science says you are now 25% more likely to miss the next one, so do another full pass before closing. This is debiasing by metacognition, not by hope. It is the same intervention airlines began using in the late 1970s, after the first wave of Crew Resource Management research revealed that captains who had already made one decision were less open to challenges from junior officers about the next. Naming the bias does not eliminate it. It does make it harder to indulge unconsciously.
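A minimal sketch of that checklist step as a workflow guard, with all names hypothetical: recording a finding creates a debt that only an explicit full pass can clear.

```python
class AuditSession:
    """Toy workflow guard: after any finding, a full re-sweep is required
    before the engagement can close. The prompt names the bias at the
    moment it activates (Satisfaction of Search)."""

    def __init__(self):
        self.findings = []
        self._sweep_owed = False

    def record_finding(self, finding: str):
        self.findings.append(finding)
        self._sweep_owed = True
        print("CHECKLIST: you just found a vulnerability. The cognitive "
              "science says you are now 25% more likely to miss the next "
              "one. Do another full pass before closing.")

    def complete_full_pass(self):
        # Only an explicit, deliberate full pass clears the debt.
        self._sweep_owed = False

    def close(self):
        if self._sweep_owed:
            raise RuntimeError("Cannot close: a post-finding full pass is still owed.")
        return self.findings
```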
Most expensive but highest leverage: AI as the asymmetric reviewer. Veksler et al. (2020, Frontiers in Psychology) showed that Symbolic Deep Learning models trained on expert decisions raised non-expert threat-detection hit rates from 72.5% to 90.4% — cutting missed threats from 27.5% to 9.6%. AI doesn't get bored. It doesn't experience pride after the first finding. Its blind spots are different from the human's, which is precisely what you want from a second observer. Note: false alarms also rose, from 29.8% to 34.3%. The win was in the recall ceiling, not in pure accuracy — and that's the right tradeoff for an audit context, where missing a critical bug is much costlier than triaging a false positive. The DARPA AI Cyber Challenge final in 2025 demonstrated AI systems identifying 77% of planted vulnerabilities and discovering 18 real-world bugs not planted by the competition. The recall ceiling is no longer hypothetical.
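The tradeoff claim is easy to check against the paper's own numbers. A worked sketch, with two loudly labeled assumptions: the 100:1 miss-to-triage cost ratio and the 50% base rate are illustrative, not from Veksler et al.

```python
# Expected-cost check using the hit and false-alarm rates quoted above.
# ASSUMPTIONS (illustrative, not from the paper): missing a real threat
# costs 100x what triaging a false alarm costs; half the items are real.
MISS_COST, FA_COST, BASE_RATE = 100.0, 1.0, 0.5

def expected_cost(hit_rate: float, fa_rate: float) -> float:
    miss_rate = 1.0 - hit_rate  # missed threats are the complement of hits
    return (BASE_RATE * miss_rate * MISS_COST
            + (1 - BASE_RATE) * fa_rate * FA_COST)

human = expected_cost(0.725, 0.298)   # non-expert alone
aided = expected_cost(0.904, 0.343)   # non-expert plus model
print(f"{human:.2f} -> {aided:.2f}")  # 13.90 -> 4.97
```

Under these illustrative assumptions the expected cost drops by nearly two-thirds; the conclusion flips only if triaging a false positive becomes comparably expensive to missing a real bug.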
The pattern under all three is the same: the failure is in the architecture, not in the operator, so the fix has to be architectural.
The radiologists who missed the gorilla in 2013 were not bad radiologists. They were good ones — that is the entire point. They had built a chunk library calibrated to find lung nodules, and the library worked: when given real scans with real nodules, they found them. The gorilla didn’t fit the library, so the library didn’t surface it. Eye-tracking caught their gaze going right past it. The expertise was the apparatus.
Security expertise is the same apparatus pointed at code. The good news is we know how to build it: thousands of hours of structured practice, feedback loops, mentorship, the accumulated pattern library that lets a senior researcher see a vulnerability before they can articulate why. The unhelpful news is that the same apparatus has a precisely shaped shadow, and the shadow is where the next breach lives.
The cognitive science of adversarial thinking is not really about how the best auditors find what others miss. It is about how the best auditors, in finding what others miss, also miss what others would find. The answer is not better auditors. It is better arrangements of the auditors we already have — a second pair of eyes with a different chunk library, a workflow that names the bias at the moment it activates, and an AI reviewer that is bored by exactly the things humans get proud of finding.
Build the apparatus. Then build the apparatus that watches the apparatus.
Sources: Drew et al., “The invisible gorilla strikes again,” Psychological Science 24(9), 2013. Christensen et al., arXiv:2505.12018v1, May 2025. Chase & Simon, “Perception in chess,” Cognitive Psychology 4(1), 1973. Reingold & Sheridan, eye-tracking expertise studies, 2011. Votipka et al., vulnerability discovery research. Facebook VRP 2015 disclosure; Bugcrowd cumulative 2013-2016. MDPI Systems 13(4), 280, 2025. ASIS International review, 2022. Kahneman & Klein, “Conditions for intuitive expertise,” American Psychologist, 2009. Ferguson-Walter et al., Tularosa study, USENIX Security 2021. Oesch et al., Journal of Cybersecurity, 2023. IARPA ReSCIND program, February 2024; Charles River Analytics consortium. Veksler et al., “Symbolic deep learning for threat detection,” Frontiers in Psychology, 2020. DARPA AI Cyber Challenge final, 2025.