The seal population recovered. The genome did not, and never will. Model collapse is the same law, and the average is the one instrument that cannot see it.
In the 1890s, sealers hunted the northern elephant seal down to what geneticists now estimate was fewer than a hundred animals, possibly as few as twenty. Then the killing stopped, the colonies were protected, and the species did what protected species are supposed to do: it came roaring back. Today there are more than a hundred thousand of them, hauled out on beaches from Baja to Point Reyes, honking and molting and generally being a conservation success story. Look at the population chart and you see a clean recovery curve, the kind that goes on a fundraising brochure.
Now look at the genome. It hasn't recovered at all. A 2024 study in Nature Ecology & Evolution found the northern elephant seal's genetic diversity still sits far below that of its southern cousin, which was never bottlenecked. The population count fully rebounded. The diversity did not, and it is not going to, not on any timescale that matters to us. The numbers lied. They said "recovered" while the thing that recovery was supposed to protect, the variation, the raw material for adapting to the next disease or the next warm current or the next thing nobody saw coming, stayed gone.
If you build machine-learning systems for a living, you have already met this animal. You just called it something else.
Start with the seal and ask the mechanical question: why did the count come back but the diversity didn't? Because a bottleneck is a finite sample. When a population crashes to twenty animals, the genes that make it through are whatever those twenty happened to carry. Every rare allele, the ones held by only a handful of individuals in the original population, is overwhelmingly likely to simply miss the boat. It isn't selected against. It just isn't drawn. And once it's not drawn, it's gone, because there's no copy left anywhere to draw from next time. Breeding a hundred thousand seals from twenty can restore the headcount, but it can only reshuffle the twenty's genes. You cannot resample what you failed to sample.
Hold that sentence, because it is the whole essay: you cannot resample what you failed to sample. It has nothing to do with seals. Strip out the biology and it reads: repeatedly draw a finite sample from a distribution, use that sample to regenerate the distribution, and the low-probability tail erodes first and cannot come back. That is a statement about statistics, and it is true of anything shaped like it.
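That statement is easy to check in a few lines. The sketch below is a toy, not a population-genetics model. It assumes a hypothetical founder pool of 900 copies of one common allele plus 100 alleles carried by a single individual each, and it lets every offspring copy one parent chosen uniformly at random. The names `draw`, `original`, `survivors`, and `recovered` are illustrative.

```python
import random


def draw(population, size, rng):
    """Build a new generation by copying `size` parents chosen uniformly at random."""
    return [rng.choice(population) for _ in range(size)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical founder pool: one common allele (900 copies) plus 100 singletons.
original = ["common"] * 900 + [f"rare_{i}" for i in range(100)]

survivors = draw(original, 20, rng)       # the bottleneck: 20 animals
recovered = draw(survivors, 10_000, rng)  # the headcount rebounds

# Chance that one specific singleton allele is missed by all 20 draws.
miss_probability = (1 - 1 / len(original)) ** 20

print("headcount:", len(original), "->", len(survivors), "->", len(recovered))
print("distinct alleles:", len(set(original)), "->",
      len(set(survivors)), "->", len(set(recovered)))
print(f"P(a given rare allele is never drawn) = {miss_probability:.3f}")
```

Whatever the seed, the headcount can be regrown to any size, but the allele count can never exceed what the 20 survivors carried, because `recovered` is drawn only from `survivors`.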
Which is why, in 2024, a team led by Ilia Shumailov published a paper in Nature with a title that could double as an epitaph: "AI models collapse when trained on recursively generated data." Train a generative model, use it to produce text, train the next model partly on that text, and repeat. What happens is not random degradation. It is specific and directional. The paper decomposes it into three compounding errors: a sampling error (any finite batch under-represents rare events), an expressivity error (finite models can't capture a distribution's fine structure), and an optimization error (training favors the easy, high-frequency patterns). A related paper, "A Tale of Tails," analyzes the same loss of the tail as a change in scaling laws. The rare, surprising, low-probability outputs get hit by all three errors at once. So the model doesn't fail all over. It fails from the edges in. Early collapse looks like the weird tails going quiet. Late collapse looks like everything sliding toward one bland, high-probability mush.
The tails vanish first. Then, eventually, everything else follows them.
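The fit-then-generate loop can be sketched with the simplest generative model available, a Gaussian. This is a toy under stated assumptions, not a reproduction of the paper's experiments. The "model" is just a fitted mean and standard deviation, each generation is trained only on 20 samples from the previous one, and `recursive_fit` and its parameters are hypothetical names.

```python
import random
import statistics


def recursive_fit(generations, sample_size, seed=0):
    """Repeatedly sample from a fitted Gaussian, then refit to those samples.

    Returns the fitted standard deviation after each generation,
    starting with the true value of 1.0 at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # the "real" distribution
    history = [std]
    for _ in range(generations):
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)   # refit on synthetic data only
        std = statistics.pstdev(samples)
        history.append(std)
    return history


spread = recursive_fit(generations=500, sample_size=20)
for g in (0, 10, 100, 500):
    print(f"generation {g:3d}: fitted std = {spread[g]:.3g}")
```

Each refit sees a finite sample that slightly under-represents the extremes, so the fitted spread drifts downward on average and the losses compound. The width of the distribution, which is where the tails live, is what disappears.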
This is not a metaphor being stretched across two fields. It is the same finite-sampling term showing up in a gene pool and in a training loop, and in 2025 someone finally wrote down the general version, a "First-Extinction Law for Resampling Processes," which says, substrate-agnostic, that under repeated resampling the rarest element is the first to go extinct. Genetic drift is that law in DNA. Model collapse is that law in weights. The seal and the language model are running the same program on different hardware.
Here is the property that turns this from an interesting analogy into an operational warning, and it is the one thing I want a tech lead to take away even if they forget everything else.
Because the tails go first and the bulk persists, the collapse is invisible to anyone watching the mean.
The elephant seal population chart looked great. Perplexity, the standard health metric for a language model, can look fine deep into collapse, because perplexity is computed from an average per-token loss, and that average is dominated by the common cases, which are exactly the cases that survive longest. Crop yield goes up in the years you're planting a single high-performing variety across a whole country. The top-40 chart is always full. The dashboard is green.
Every one of those numbers is a mean, and a mean is precisely the instrument that cannot see this failure, because the failure is happening in the variance. Your model's average loss holds steady while its ability to produce anything rare, surprising, or genuinely creative is quietly bleeding out. You will not catch it in the metric you are almost certainly watching. You have to go looking in the tail, at the rare modes, the long-shot generations, the outputs three standard deviations from the center, and measure whether they still exist. Nobody instruments the tail by default. That is the whole problem.
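A few lines of arithmetic show how little the mean can move. Every number here is invented for illustration: a hypothetical evaluation set of 990 common prompts and 10 rare ones, with made-up per-example losses before and after the tail erodes.

```python
# Hypothetical evaluation set: 990 common prompts and 10 rare ones.
N_COMMON, N_RARE = 990, 10


def mean_loss(common_loss, rare_loss):
    """Average per-example loss over the whole evaluation set."""
    return (N_COMMON * common_loss + N_RARE * rare_loss) / (N_COMMON + N_RARE)


# Made-up losses: the common slice improves slightly, the rare slice triples.
before = {"common": 2.00, "rare": 3.0}
after = {"common": 1.94, "rare": 9.0}

mean_before = mean_loss(before["common"], before["rare"])
mean_after = mean_loss(after["common"], after["rare"])

print(f"mean loss:       {mean_before:.4f} -> {mean_after:.4f}")
print(f"rare-slice loss: {before['rare']:.1f} -> {after['rare']:.1f}")
```

With these numbers the rare-slice loss triples while the overall mean shifts by less than 0.001, because the rare slice is only 1% of the average.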
If the tail were just noise, losing it first would be a mercy. But across every one of these systems the same brutal fact holds: the value is in the tail, so tails-first collapse discards your most valuable holdings while flattering your averages.
In genetics the tail is future adaptability. The cheetah is the cautionary tale: an ancient population bottleneck left the species with almost no genetic variation, and the bill came due as extreme vulnerability. A 1985 Science paper documented roughly 70% abnormal sperm and a striking susceptibility to disease. The diversity the cheetah lost was its resilience; it just didn't find out until the environment asked a question its narrowed genome couldn't answer.
In agriculture the tail is disease insurance. When Ireland planted its calories on essentially one clonal potato variety, Phytophthora infestans didn't have to defeat a range of defenses, only one, and roughly a million people died. When the world's banana trade standardized on the Gros Michel, a single fungal strain wiped it out commercially and we grumbled our way to the Cavendish, which is now facing its own fungus for exactly the same reason. The FAO's much-quoted figure is that around 75% of crop genetic diversity was lost across the twentieth century. That precise number is genuinely contested; the careful treatment is Khoury and colleagues' 2022 review in New Phytologist, which shows the erosion is real but uneven. The mechanism is not in dispute. Consolidate onto the dominant variety and the rare landraces drop out first, and the rare landraces were where resistance to the next blight was hiding.
In culture the tail is irreplaceable knowledge. Of the world's roughly 7,000 languages, nearly half are endangered; a 2022 study in Nature Ecology & Evolution by Bromham and colleagues put the trajectory starkly, projecting that a large fraction could be gone within a century. A language is a long tail by construction, a few giants and thousands of tiny ones, and it is eroding from the small end first. Each one that goes takes a way of seeing with it, and there is no seed bank that brings it back.
At this point the obvious moral is "competition and consolidation destroy diversity, so beware competition." That moral is wrong, and getting it exactly right is the difference between a useful essay and a doomer one.
Go back to ecology. Gause's competitive exclusion principle, from his 1934 Paramecium experiments and later sharpened by Garrett Hardin in 1960 into "complete competitors cannot coexist," says that two species living on the identical resource in the identical way cannot stably share it: the slightest edge compounds until one is driven out. That sounds like a law of ruthless winners. But its own escape clause is the interesting part. The loser's stable move is not to fight harder for the identical slice; it is to diverge, to shift to a different food, place, or time and stop being a complete competitor. And so the pressure of competition, over evolutionary time, doesn't only exclude. It drives differentiation. Much of the diversity of life is the residue of exclusion's pressure to become distinct.
Hutchinson made the same point from the other side in 1961 with his "paradox of the plankton": absurdly many plankton species coexist on a handful of limiting resources, in flat violation of a naive reading of Gause, precisely because the ocean never sits still long enough for exclusion to finish its work. Disturbance, seasonality, and heterogeneity keep resetting the game so no single competitor ever wins it all.
Put those together and the real law snaps into focus. What kills diversity is not competition. It is undifferentiated concentration: competition on a single, undivided niche with no room to differ, run all the way to equilibrium. Differentiated competition does the opposite; it manufactures diversity. Economists find the same signature in culture, where Bourreau and colleagues' 2022 study in Economic Inquiry found market concentration tracking with content homogeneity and genuine competition tracking with diversity. And in machine learning, researchers have shown that evolutionary training with strict winner-take-all pressure but room to specialize forces models into distinct niches and beats homogeneous baselines by double-digit margins. Harsh competition with somewhere to differ is a diversity engine. Soft, undifferentiated competition, where everyone hedges toward the same safe average, is a diversity shredder.
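The difference between the two kinds of competition can be sketched with the textbook Lotka-Volterra competition equations. The essay does not name that model, so treat it as a standard formalization swapped in for illustration. The carrying capacities, the `overlap` coefficient, and the function name `compete` are all hypothetical choices.

```python
def compete(k1, k2, overlap, r=1.0, dt=0.01, steps=40_000):
    """Euler-integrate two-species Lotka-Volterra competition.

    `overlap` is the competition coefficient: 1.0 means both species use
    exactly the same niche, values below 1.0 mean their niches differ.
    Returns the two population sizes at the end of the run.
    """
    n1 = n2 = 10.0
    for _ in range(steps):
        d1 = r * n1 * (1 - (n1 + overlap * n2) / k1)
        d2 = r * n2 * (1 - (n2 + overlap * n1) / k2)
        n1 += d1 * dt
        n2 += d2 * dt
    return n1, n2


# Species 1 has a modest edge in carrying capacity (110 vs 100).
same_niche = compete(k1=110, k2=100, overlap=1.0)
split_niche = compete(k1=110, k2=100, overlap=0.5)

print("identical niche:      n1=%.2f  n2=%.2g" % same_niche)
print("differentiated niche: n1=%.2f  n2=%.2f" % split_niche)
```

With full overlap the small edge compounds until the second species is driven to effectively zero. With partial overlap the same two competitors settle into stable coexistence, which is the escape clause in miniature.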
The recursive training loop is the second kind. A model trained on the internet's exhaust, which is increasingly its own exhaust, is not in a differentiated competition. It is a monoculture feeding on itself, an undivided niche with one occupant and no pressure to be distinct. It is the Gros Michel of cognition.
The good news hiding in all of this is that because the mechanism is one thing, the cure is one thing too, and it is not exotic. Across genetics, agriculture, culture, and machine learning the fix is identical in shape: keep a protected reservoir of the rare, and never let the system train only on its own average. Seed banks and minimum viable populations. Landrace conservation and language protection. And, for the systems you actually control, three concrete moves.
Anchor on real data, and accumulate rather than replace. The model-collapse result is conditional, and that condition is the escape hatch: collapse requires recursively retraining on synthetic data that has displaced the real. Keep the original human data in the mix instead of swapping it out and the erosion largely stops. One line of work finds that simply accumulating data rather than replacing it defuses the collapse; another finds a stable regime when synthetic data is held to roughly the inverse golden ratio of the blend. In practice: treat your corpus of genuine human signal as a permanent, protected asset, not a bootstrap you outgrow. It is your seed bank.
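The replace-versus-accumulate distinction can be made concrete with a toy loop. This sketch does not reproduce the cited experiments. It assumes the "model" is simply the empirical distribution of its training set and that generating means sampling from that set with replacement. The corpus, the batch size, and the name `run_loop` are hypothetical.

```python
import random


def run_loop(real_data, generations, batch, accumulate, seed=0):
    """Return the number of distinct types in the training set per generation."""
    rng = random.Random(seed)
    training_set = list(real_data)
    distinct = [len(set(training_set))]
    for _ in range(generations):
        # "Generate": sample from whatever the current model was trained on.
        synthetic = [rng.choice(training_set) for _ in range(batch)]
        if accumulate:
            training_set = training_set + synthetic  # real data stays in the mix
        else:
            training_set = synthetic                 # synthetic displaces real
        distinct.append(len(set(training_set)))
    return distinct


# Hypothetical corpus: 200 distinct types with Zipf-like counts.
real_data = [f"type_{i}" for i in range(200)
             for _ in range(max(1, 200 // (i + 1)))]

replaced = run_loop(real_data, generations=50, batch=100, accumulate=False)
accumulated = run_loop(real_data, generations=50, batch=100, accumulate=True)

print("replace:    distinct types", replaced[0], "->", replaced[-1])
print("accumulate: distinct types", accumulated[0], "->", accumulated[-1])
```

In replace mode the count of surviving types can only fall, and it is capped by the batch size after the first generation. In accumulate mode the original corpus never leaves the training set, so every rare type stays drawable even as its share of the blend shrinks.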
Instrument the tail, not just the mean. This is the cheap, high-return move almost nobody makes. Add metrics that watch variance and rare-mode coverage, the diversity of your outputs and the survival of the low-probability generations, and put them on the same dashboard as your average loss. If you only watch the mean, you have deliberately blinded yourself to the one failure mode that lives in the tail. The seal count looked fine, too.
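As a starting point, two tail-facing metrics fit in a couple of dozen lines. This is a minimal sketch, assuming outputs can be bucketed into discrete types (topics, templates, clusters, or similar). The function names, the `max_share` threshold, and the sample data are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(outputs):
    """Entropy in bits of the empirical distribution over output types."""
    total = len(outputs)
    shares = [count / total for count in Counter(outputs).values()]
    return sum(-p * math.log2(p) for p in shares)


def rare_mode_coverage(reference, outputs, max_share=0.02):
    """Fraction of the reference set's rare types that still appear in outputs.

    A type counts as rare if it makes up at most `max_share` of the reference.
    """
    total = len(reference)
    rare = {t for t, c in Counter(reference).items() if c / total <= max_share}
    if not rare:
        return 1.0  # nothing rare to lose
    return len(rare & set(outputs)) / len(rare)


# Hypothetical reference corpus and two batches of model outputs.
reference = ["head"] * 95 + ["r1", "r2", "r3", "r4", "r5"]
healthy = ["head"] * 96 + ["r1", "r2", "r3", "r4"]
collapsed = ["head"] * 100

for name, batch in (("healthy", healthy), ("collapsed", collapsed)):
    print(f"{name:9s} entropy={shannon_entropy(batch):.3f} bits  "
          f"rare coverage={rare_mode_coverage(reference, batch):.2f}")
```

Both numbers are cheap to compute per evaluation run and both respond to the failure the mean hides. Entropy falls as outputs concentrate, and rare-mode coverage falls as specific known-rare types stop appearing.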
Reward distinctiveness on purpose. Wherever a system optimizes toward a single average, a recommender, a content pipeline, a fine-tune, build in explicit pressure to differ: a niche to occupy, a reason not to converge. Differentiated competition creates diversity, so give your system somewhere to diverge to.
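One simple way to build in that pressure is a greedy re-ranker that discounts candidates crowding into a niche that is already represented. This is an illustrative sketch, not a method from the research cited above. The candidate tuples, the `penalty` value, and the name `rerank` are hypothetical.

```python
def rerank(items, k, penalty):
    """Greedily pick k items, discounting each candidate's score by
    `penalty` for every already-picked item in the same niche.

    `items` is a list of (name, score, niche) tuples.
    """
    remaining = list(items)
    picked = []
    while remaining and len(picked) < k:
        def adjusted(item):
            _, score, niche = item
            crowding = sum(1 for _, _, n in picked if n == niche)
            return score - penalty * crowding

        best = max(remaining, key=adjusted)
        picked.append(best)
        remaining.remove(best)
    return [name for name, _, _ in picked]


# Hypothetical candidates: niche "A" holds the three highest raw scores.
candidates = [("a1", 0.90, "A"), ("a2", 0.85, "A"), ("a3", 0.80, "A"),
              ("b1", 0.70, "B"), ("c1", 0.60, "C")]

print("no pressure to differ:", rerank(candidates, 3, penalty=0.0))
print("pressure to differ:   ", rerank(candidates, 3, penalty=0.3))
```

With no penalty the top three slots all go to niche A. With a penalty of 0.3 the best item from each of three niches is selected, at the cost of some raw score, which is the price the next paragraph is about.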
None of this is free, and none of it shows up as a win this quarter, because the thing it protects is invisible in the average until the day you need it. That is the discipline the whole pattern demands, and the reason it is so rarely practiced. You are paying, continuously, to preserve options you cannot currently see the value of, against a collapse your metrics are structurally unable to show you.
The northern elephant seal is fine, by the numbers. It will go on being fine, by the numbers, right up until the ocean asks it something its flattened genome can't answer. Your models, your datasets, your culture's feed will look fine by the numbers too. The mean is a comfortable thing to watch, and it will keep looking healthy right up until it is all that's left.
The first cure is "anchor on real data." You can only do that if you can prove what is real.
Every fix in this essay starts by keeping a protected reservoir of genuine human signal and never letting the system train only on its own exhaust. That is a provenance problem before it is a data-pipeline one: to keep the real data separable from the synthetic, you have to know which outputs an agent actually produced and how. Chain of Consciousness is the tamper-evident record of what an agent did to reach a result, so a real human-signal corpus stays auditable instead of dissolving into the same undifferentiated exhaust that feeds the collapse.
pip install chain-of-consciousness · npm install chain-of-consciousness