Economics of AI Bounty Hunting: Expected Value, Rejection Rates, and the Automation Threshold

The advertised payout is the slot machine's marquee jackpot. The expected value is the actual spin — and the spin is moving.

Published May 2026 · 12 min read

A researcher writing under the handle AiTuglo described pointing Claude at a private bug bounty program and waking up to ten findings he hadn’t lifted a finger to produce. He wrote it up like a magic trick. Then came the punchline. Half were duplicates. The remaining five disappeared into a triage queue that, by his own admission, “now takes weeks.” The advertised maximum payout for the program — the number plastered on its marketing page, the one that gets shared on social media — was deep into five figures. The realized payout, after the funnel finished compressing it, was something quite a bit closer to zero.

This is the hidden equation of bug bounty hunting in 2026. The advertised payout is the slot machine’s marquee jackpot. The expected value is the actual spin.


The funnel that turns five figures into a few hundred dollars

One of the cleaner public derivations of the math sits on a personal blog at dmshagov.github.io. The framing there: bug bounty earnings are a multiplicative chain. Every submission has to pass through a sequence of independent filters, each with its own failure probability:

EV = P(finding the bug) × P(not a duplicate) × E(payout | valid)

The full version has more terms — in scope, accepted, actually paid. Each multiplier is less than one. They compound. Stack the funnel out and a five-figure headline maximum can realize as a few hundred dollars per submission averaged across a year.
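The compounding is easy to underestimate until you multiply it out. A minimal sketch, with every probability and the mean payout invented for illustration (none of these numbers come from the dmshagov data):

```python
# Hedged sketch of the multiplicative EV funnel described above.
# Every input value is an illustrative assumption, not measured data.
def expected_value(p_find, p_unique, p_in_scope, p_accepted, p_paid, payout_if_valid):
    """Expected dollars per submission attempt: each filter multiplies in."""
    return p_find * p_unique * p_in_scope * p_accepted * p_paid * payout_if_valid

# A five-figure advertised maximum compressed by plausible filter rates:
ev = expected_value(
    p_find=0.5,             # you actually find something on this target
    p_unique=0.6,           # it survives the duplicate check
    p_in_scope=0.9,         # it is inside the program's scope
    p_accepted=0.8,         # triage accepts it as valid
    p_paid=0.95,            # the program actually pays
    payout_if_valid=2_000,  # mean payout for valid findings, far below the max
)
print(round(ev))  # → 410: a few hundred dollars, not five figures
```

Individually generous-looking filters still compress a $2,000 conditional payout to roughly a fifth of itself, which is the whole argument in one line of arithmetic.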

The duplicate term alone is brutal. The dmshagov data showed a first-year HackerOne researcher returning roughly 30% duplicates. The same researcher on Synack — a curated platform that vets hunters in advance and limits program access — saw under 10% in the same period. The difference is mostly throttling: fewer hunters per program means fewer collisions. HackerOne’s open model accepts more submissions, generates more collisions, and pushes the rejection cost back onto the hunter.

Now layer in the AI flood. HackerOne’s 2025 disclosures, reported by CSO Online: a 210% spike in valid AI-related vulnerability reports year over year, and a 339% increase in total bounties paid for AI vulnerabilities. Total industry payouts grew about 13%, to roughly $81 million. The gap between those numbers — submissions doubling-plus, total dollars growing only modestly — is the supply-side compression story written in receipts. More valid findings, less money per finding.

The ratio of advertised maximum to realized average is the metric nobody publishes and almost everyone needs.


The teachability threshold

The cleanest definition I have found of where AI is replacing humans in security work comes from a piece on Penligent: “Anything a skilled researcher could teach a junior researcher to do in two weeks, AI can now do in minutes.”

That sentence is the entire thesis of the AI bounty wave compressed into twenty words.

It works because it is not framed in terms of CVSS scores or vulnerability classes. It is framed in terms of transmissibility of knowledge. If a senior researcher could write a two-week training plan for a junior — read these files, run these tools, look for these patterns, escalate when X — then the plan is, by definition, codifiable. And anything codifiable in a two-week training plan is now codifiable in a system prompt.

The teachability threshold does the work that vulnerability classifications used to do. Reflected XSS in a textbook web app: teachable. Automatable. The market for it is collapsing. Authentication-bypass via subtle business-logic interaction across three microservices in a payment flow: not teachable in two weeks, not yet automatable, and the median payout there is rising fast.

The threshold is also moving. The UK AI Security Institute has estimated that frontier systems double the length of unassisted offensive-security tasks they can complete roughly every eight months. Lyptus Research’s narrower analysis of 2024-and-later models puts the doubling time at 5.7 months. Whichever number you trust, the implication is the same: vulnerability classes that are barely automatable today will be cleanly automatable inside one year. Anyone making investment decisions based on the current capability boundary is valuing a currency that’s inflating at over 100% per year.
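The "over 100% per year" claim follows directly from the doubling times quoted above; the conversion is pure arithmetic:

```python
# Convert a capability doubling time (in months) into an annualized growth
# factor, using the two estimates quoted in the text. No data, just arithmetic.
def annual_factor(doubling_months):
    return 2 ** (12 / doubling_months)

for label, months in [("UK AISI, ~8 months", 8), ("Lyptus Research, 5.7 months", 5.7)]:
    f = annual_factor(months)
    print(f"{label}: x{f:.2f} per year (+{(f - 1) * 100:.0f}%)")
```

The slower estimate implies roughly a 2.8x annual capability gain (+183%), the faster one about 4.3x (+330%), so "over 100% per year" holds under either figure.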


The multi-armed bandit isn’t a metaphor

Here is where the cross-domain math gets exact, not analogical.

A bug bounty hunter chooses how to allocate time across many programs with different payout distributions, different rejection rates, and different competition levels. The hunter must balance exploitation of programs they already know pay well against exploration of new programs whose distributions are unknown. They face partial information, sequential decisions, and stochastic rewards.

This is the multi-armed bandit problem from decision theory. Not a metaphor for it, not similar to it: it is the same formal structure. The mathematics that governs A/B test allocation in advertising, treatment selection in adaptive clinical trials, and slot-machine optimization in operations research applies to program selection in bounty hunting. The dmshagov heuristic — spend roughly 90% of your time on the best-paying program and 10% exploring others — is a hand-tuned approximation of the explore/exploit tradeoff. Thompson Sampling and Upper Confidence Bound algorithms, both standard bandit solutions, would do the same job better.
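The explore/exploit loop can be made concrete. A minimal Thompson Sampling sketch, where each "arm" is a bounty program and a reward of 1 means a paid finding; the true payout rates and program names are invented for illustration:

```python
import random

# Minimal Thompson Sampling for program selection. Each arm keeps a Beta
# posterior over its probability of paying out; each round we pick the arm
# with the highest posterior draw, which explores and exploits in one step.
class BetaArm:
    def __init__(self):
        self.wins, self.losses = 1, 1  # Beta(1, 1) uniform prior

    def sample(self):
        return random.betavariate(self.wins, self.losses)

    def update(self, paid):
        if paid:
            self.wins += 1
        else:
            self.losses += 1

random.seed(0)
true_rates = {"program_a": 0.05, "program_b": 0.15, "program_c": 0.30}  # unknown to the hunter
arms = {name: BetaArm() for name in true_rates}

for _ in range(2000):
    name = max(arms, key=lambda n: arms[n].sample())   # posterior draw per arm
    arms[name].update(random.random() < true_rates[name])

# Most pulls end up on the best-paying program without any hand-tuned 90/10 split.
best = max(arms, key=lambda n: arms[n].wins + arms[n].losses)
print(best)
```

Unlike the fixed 90/10 heuristic, the allocation here sharpens automatically as evidence accumulates, which is exactly what "would do the same job better" means in practice.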

The meta-implication is the one most people miss. If the program-selection problem is a bandit problem, then the strategy layer is just as automatable as the execution layer. An autonomous agent can pick which program to attack using exactly the same algorithms it uses to allocate ad spend. The automation wave isn’t only about finding bugs faster. It’s about choosing which bugs to look for at all. Both halves of the work are now within reach of a single optimizer.


The gig economy mirror

The labor-economics parallel is the third leg of the stool, and it’s the leg that turns this from a security story into a workforce story.

Bug bounty hunting is the original AI-mediated gig economy. Before Uber drivers worried about robotaxis, security researchers were already competing with scripts. The platform mediates the relationship between the worker (hunter) and the buyer (program). The worker eats the cost of every failed attempt — false positives, duplicates, and out-of-scope submissions are all unpaid labor. The platform takes a cut. The distribution of earnings is winner-take-most.

The numbers are familiar from gig-economy literature. Aggregator estimates put the global gig market past $674 billion in 2026, with 76.4 million US freelancers representing roughly 36% of the workforce in 2025. Inside that market, freelancers with verifiable AI and prompt-engineering skills are reported to command a 56% wage premium over comparable traditional roles. JP Morgan’s research arm has estimated AI will displace around a million US jobs per year over the next decade, with AI-specific unemployment running near 6% while overall GDP holds steady — productivity rising, headcount thinning.

In bounty hunting, the same pattern is playing out at miniature scale. Bugcrowd’s 2026 survey of more than 2,000 researchers reported 82% already using AI in their workflow. HackerOne’s number is 70%. The “bionic hacker,” in Crystal Hazen’s phrasing — humans augmented by agentic systems — is now the median, not the elite. The elite are doing something else.

What they are doing is the work that doesn’t fit in a two-week training plan. The 56% wage premium for AI-skilled freelancers maps almost cleanly onto the bounty-hunting bifurcation: bionic hackers riding the AI multiplier upward, and a long automatable tail being competed away by the same AI without the multiplier.


The strangest market — companies racing themselves

The most counterintuitive dynamic in the 2026 market is one nobody is talking about loudly. Companies running bug bounty programs are now using the same AI tools internally that their external researchers use externally.

The race condition writes itself. A company runs Claude on its own codebase Tuesday morning. The same model variant, with similar prompting, finds a particular SSRF in a webhook handler. The company files it as a known issue. Wednesday night, an external researcher runs the same workflow on the same target and finds the same bug. They write it up, submit it, wait three weeks for triage. The result: duplicate, closed, $0.

This is not a one-off. The AiTuglo essay argued that this dynamic explains rising duplicate rates faster than any other variable. As internal AI coverage matures inside large programs, the duplicate rate for AI-discoverable commodity bugs trends toward saturation — meaning the expected value of automated external hunting on those classes trends toward zero.

Bounty programs were designed to crowdsource findings the company couldn’t find. When the company runs the same tools as the crowd, the economic rationale for the program changes. It’s no longer “find what we missed.” It’s “find what AI can’t find.” Which is a smaller, harder, and more expensive target set. Which is also exactly where critical-bug payouts are tripling — from a 2022 range of $3,000–$8,000 to a 2026 range of $8,000–$25,000 in industry summaries.


Where this argument breaks

This essay leans on a few load-bearing claims that deserve direct scrutiny before anyone makes a career decision off them.

The 5.7-month doubling time comes from Lyptus Research, cited via Penligent. It is one research group’s analysis. Until I see independent replication on a different benchmark, I’d treat it as directionally right and numerically uncertain. The eight-month figure from the UK AI Security Institute is in the same neighborhood, which is mild support, but the two estimates may not be independent.

The multi-armed bandit framing is structurally correct, but the standard formulation assumes stationary reward distributions. Bug bounty programs are non-stationary in a brutal way: scope changes, new features ship, internal scanning closes off whole vulnerability classes overnight. Standard bandit algorithms underperform in non-stationary environments without modification. The math still applies; the algorithms need their non-stationary versions (sliding-window UCB, discounted Thompson Sampling, change-point detectors).
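One of the named fixes is small enough to show. A simplified sketch of discounted Thompson Sampling, decaying each arm's pseudo-counts on update (the textbook version decays all arms every round; the discount factor here is an assumed value):

```python
import random

# Discounted Thompson Sampling sketch for non-stationary programs: old
# evidence decays, so the posterior can track a payout rate that changes
# (e.g. internal scanning closes off a bug class overnight).
GAMMA = 0.98  # discount factor; <1 forgets stale observations (assumed value)

class DiscountedArm:
    def __init__(self):
        self.wins, self.losses = 1.0, 1.0

    def sample(self):
        return random.betavariate(self.wins, self.losses)

    def update(self, paid):
        # Decay the existing pseudo-counts, then add the new observation.
        self.wins = GAMMA * self.wins + (1.0 if paid else 0.0)
        self.losses = GAMMA * self.losses + (0.0 if paid else 1.0)
```

Because the pseudo-counts are capped near 1/(1-GAMMA), a long-dead program that suddenly starts paying can be rediscovered in dozens of rounds rather than thousands, which the stationary version cannot do.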

The gig-economy parallel breaks where the risk profile is fundamentally different. An Uber driver gets paid for every ride they complete. A bounty hunter eats the cost of every false positive, every duplicate, every informational-but-unrewarded finding. The unpaid-labor share is much higher. This is closer to spec submission for a creative agency than to gig driving — and the literature on agency economics, not gig economics, may be the more accurate analogue. The policy implications differ.

The expert quotes that anchor the industry-bifurcation narrative all trace back, in this research, to a single CSO Online piece. They are not independently triangulated. Treat the consensus they describe as one publication’s reporting, not a settled industry view.

And the “negative mean-time-to-exploit” datum sometimes cited from Google’s M-Trends 2026 report came to me through a secondary citation rather than the primary report. I have not personally verified it. If you build on it, find the primary.


What to do with this

For a researcher trying to make rent on bounty hunting in 2026, the EV math gives you four levers in rough order of impact:

  1. Optimize for realized expected value, not advertised maximum. Track your own duplicate rate per program, your acceptance rate, your time-to-payment. The platform that pays $5,000 with a 10% duplicate rate ($4,500 expected) beats the platform that pays $15,000 with a 75% one ($3,750 expected). This is dull arithmetic and almost nobody does it explicitly.
  2. Move above the teachability threshold or move out. Spend your discretionary time learning the kinds of bugs that don’t fit in a two-week onboarding plan: business-logic flaws that span service boundaries, authentication bypasses that require reading product roadmaps, race conditions in distributed systems. Below the threshold is a war you don’t win.
  3. Use AI as a force multiplier, not a substitute. The roughly 5x productivity gain reported by practitioners — recon that took four hours now takes forty minutes reviewing AI-generated summaries — is real. The 100% replacement is the trap. The bionic-hacker model captures the multiplier without forfeiting the high-tier payouts that AI alone can’t earn.
  4. Treat program selection as a bandit problem. Most hunters intuit this. Few do it explicitly. Maintaining a small spreadsheet — programs, payouts, response times, your historical valid-rate per program — and rebalancing every month or two will outperform almost any heuristic.
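Levers 1 and 4 reduce to the same small piece of bookkeeping. A sketch of the spreadsheet as code, ranking programs by realized dollars per hour; the program names and every number are invented placeholders:

```python
# Rank programs by realized EV per hour, not advertised maximum.
# All figures below are invented placeholders for illustration.
programs = {
    "acme_corp": {"submissions": 40, "paid": 6, "total_payout": 9_000, "hours": 120},
    "big_bank":  {"submissions": 25, "paid": 2, "total_payout": 12_000, "hours": 150},
}

def realized_ev_per_hour(stats):
    return stats["total_payout"] / stats["hours"]

def valid_rate(stats):
    return stats["paid"] / stats["submissions"]

for name, stats in sorted(programs.items(), key=lambda kv: -realized_ev_per_hour(kv[1])):
    print(f"{name}: ${realized_ev_per_hour(stats):.0f}/hr, {valid_rate(stats):.0%} valid")
```

Even this toy example surfaces a non-obvious ranking: the program with the worse valid-rate can still win on dollars per hour, which is why tracking only one of the two numbers misleads.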

For a program manager on the buyer side, the asymmetry runs the other way. YesWeHack data suggests that doubling payouts produces roughly 3x critical findings, not 2x — a nonlinear elasticity that means most current programs are priced below the level that would attract optimal talent. Low-quality AI slop is filling the gap that better payouts could instead fill with bionic hackers. Increasing high-tier payouts is the rare move that improves both signal and retention at the same time.
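The "nonlinear elasticity" in that datapoint can be made exact under a constant-elasticity assumption (my framing, not YesWeHack's):

```python
import math

# If doubling payouts yields ~3x critical findings, the implied supply
# elasticity under a constant-elasticity model is log(3)/log(2) ~ 1.58:
# each 1% payout increase yields roughly 1.58% more critical findings.
elasticity = math.log(3) / math.log(2)
print(round(elasticity, 2))  # → 1.58
```

An elasticity above 1 is the whole buyer-side argument in one number: marginal bounty dollars buy more than proportional marginal findings.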


Back to AiTuglo

The researcher who started this essay woke up to ten bugs found by Claude overnight. Five duplicates. Five waiting in triage. He wrote about it because the experience was strange enough to need explaining — not the discovery, but the disappointment.

The advertised payout was the slot machine’s marquee jackpot. The expected value, after the funnel and the duplicate collisions and the triage delays and the company quietly running the same model on the same codebase three days earlier, was something quite a bit closer to nothing.

This is the part of the bounty market that’s getting bigger every month. It is also the part of the market that the published numbers refuse to describe. The published number is the maximum. The number that matters is the expected value. And the expected value is moving — at roughly six months a doubling, by the most aggressive measurement available — away from anyone trying to live on it without the multiplier or the threshold.

The bug is in the marketing copy. The patch is in the math.


Bounty hunting is a track-record problem in disguise

The hunter who knows their per-program duplicate rate and acceptance rate beats the hunter who only knows the advertised max. The same is true for autonomous agents bidding on tasks: a portable, verifiable record of what they’ve done and how it landed is the difference between a bionic operator and a noisy long-tail submission. The Agent Trust Stack gives agents that record — cryptographic provenance for what they produced (Chain of Consciousness), bilateral ratings for how it was received (Agent Rating Protocol), and handshake protocols so counterparties can check both before allocating work.
