mink

Article

Mink is a recurring concept in the Astral Codex Ten archive, appearing 2 times across 2 issues between March 28, 2024 and September 11, 2025. The archive places it in contexts such as “COVID spread to deer and mink”; “a chatbot named Mink (all of their sample AIs are named after types of fur)”; “Mink will lie about all of this - even if it really wants perfect optimized partners”. It most often appears alongside Eliezer Yudkowsky, MIT, and the ACX comment thread.

Metadata

  • Category: Concepts
  • Mention count: 2
  • Issue count: 2
  • First seen: March 28, 2024
  • Last seen: September 11, 2025

Appears In

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

March 28, 2024 · Original source
Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods. This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there. Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it’s always been twice as many cases. But Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first. Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans. Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain. 
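The head-start arithmetic in Peter's argument above can be checked with a minimal sketch. It assumes, as the passage does, that both lineages grow exponentially at the same doubling time, so a constant 2:1 case ratio corresponds to a head start of exactly one doubling time. The function names `cases` and `head_start_days` are hypothetical, introduced only for this illustration.

```python
import math


def cases(day: float, start_day: float, doubling_time_days: float) -> float:
    """Idealized case count for a lineage seeded with one case on start_day."""
    if day < start_day:
        return 0.0
    return 2.0 ** ((day - start_day) / doubling_time_days)


def head_start_days(case_ratio: float, doubling_time_days: float) -> float:
    """Head start implied by a constant case ratio between two lineages
    that grow at the same rate (assumption taken from the passage)."""
    return doubling_time_days * math.log2(case_ratio)


# The passage gives a doubling time of 3-4 days and a steady 2:1 ratio of B to A.
for doubling_time in (3.0, 3.5, 4.0):
    print(doubling_time, head_start_days(2.0, doubling_time))

# Cross-check: if B starts one doubling time before A, the ratio stays at 2.
ratio_on_day_20 = cases(20, 0.0, 3.5) / cases(20, 3.5, 3.5)
print(ratio_on_day_20)
```

Under these assumptions the implied head start equals the doubling time itself, which is why the passage reads "3-4 days" directly off the 2:1 ratio.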
Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans. Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market! Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market. Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market. There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market. Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. 
Stop calling the wet market a super-spreader event. The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug. The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. 
And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations. Session 2: Viral Genetics Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened. (quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.) We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. 
Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the protein arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codons CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end. That suggests it was already adapted to humans. This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. 
The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. Even if WIV did try to create COVID, they couldn’t have. As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither. 
Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work. Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did. COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature. For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12 - 15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains. Sometimes insertions happen because of a mistake in viral replication. 
Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG. COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form. Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. 
But it’s exactly the kind of weird thing that happens in the random chance of evolution. COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them. The claim that COVID is uniquely adapted to humans is false. The paper that claimed that defined how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals. It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species. In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience. They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness. 
Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market. Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this. The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses. The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural. The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up. Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015. All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips. 
Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky. Session 3: Closing Arguments This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here. Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus. Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. 
Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate. Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. 
We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate. Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor. And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one. Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper. It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. 
His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”. The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started. Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine, but you can see Saar’s version here, and Peter’s starting at 45:33 of this video. Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate. Rootclaim’s real final probability is 94% lab leak. You can see their analysis here. And The Winner Is . . . Peter and the zoonosis hypothesis. This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. 
You can see the judges’ own summary of their reasoning here (Will, Eric). Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing. Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting. And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey). For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis. III. The Math And The Aftermath The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length. The Math: Judges Both judges included a probabilistic analysis in their written decision. Here’s the same table as above, expanded to add the judges: I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. 
For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then includes the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns. 
The Math: Extreme Odds Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example: A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. 
If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
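Saar's general argument about extreme odds, quoted above, can be made concrete with a small calculation. The sketch below is only an illustration, not anything taken from the debate or from Rootclaim's own models: it treats the stated probability (1E-9 for the DNA match, 1/10,000 for the wet market) as conditional on the argument behind it being sound, and mixes in a small chance that the argument is flawed. The function names and the values chosen for `p_error` and `p_if_error` are hypothetical.

```python
# Minimal sketch: a small chance of a flawed argument caps extreme odds.
# Only 1e-9 and 1e-4 come from the passage; every other number is a
# hypothetical input chosen for illustration.


def effective_likelihood(p_model: float, p_error: float, p_if_error: float) -> float:
    """P(evidence | hypothesis), allowing for a flawed argument.

    p_model    -- the probability the argument itself assigns
    p_error    -- chance the argument is wrong (lab mistake, framing, ...)
    p_if_error -- probability of the evidence if the argument is wrong
    """
    if not all(0.0 <= p <= 1.0 for p in (p_model, p_error, p_if_error)):
        raise ValueError("all inputs must be probabilities in [0, 1]")
    return (1.0 - p_error) * p_model + p_error * p_if_error


def bayes_factor(p_evidence_given_a: float, p_evidence_given_b: float) -> float:
    """Ratio of how well hypotheses A and B each predict the evidence."""
    return p_evidence_given_a / p_evidence_given_b


# DNA example: the prosecutor's figure taken at face value.
naive_dna = bayes_factor(1.0, 1e-9)

# Same example with a hypothetical 1-in-1,000 chance of some other
# explanation that would certainly produce a match.
adjusted_dna = bayes_factor(1.0, effective_likelihood(1e-9, 1e-3, 1.0))

# Wet market example: 1/10,000 if the argument is sound, plus a hypothetical
# 1% chance it is flawed, in which case the market is assumed to be a
# 50% outcome.
adjusted_market = effective_likelihood(1e-4, 0.01, 0.5)

print(f"DNA Bayes factor, naive:    {naive_dna:.3g}")
print(f"DNA Bayes factor, adjusted: {adjusted_dna:.3g}")
print(f"p(wet market | lab leak), adjusted: {adjusted_market:.4g}")
```

Under these assumed inputs the Bayes factor can never exceed 1 / (`p_error` × `p_if_error`), however small `p_model` is. The result is only as good as the guess for `p_error`, which is the same reason the passage urges care before leaning on this kind of reasoning.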
September 11, 2025 · Original source
Some people claim that a dispreferred political ideology (wokeness, mass immigration, MAGA, creeping socialism, techno-feudalism, etc) is close to destroying the fabric of liberal society forever, that the usual Get Out The Vote strategies are insufficient, and that maybe we should try desperate strategies like illiberal government or armed revolt. If true, that would change everything. But it’s not obviously true, and ending our current political era of peace/prosperity/democracy would be inconvenient. Each of these scenarios has a large body of work making the cases for and against. But those of us who aren’t subject-matter experts need to make our own decisions about whether or not to panic and demand a sudden change to everything. We are unlikely to read the entire debate and come away with a confident well-grounded opinion that the concern is definitely not true, so what do we do? In particular, what do we do if the proponents of each catastrophe say that it’s very hard to be more than 90% confident that they are wrong, and that even a 5-10% risk of any of these might justify panicking and changing everything? In practice, we just sort of shrug and say that these risks haven’t proven themselves enough to make us panic and change everything, and that we’ll do some kind of watchful waiting and maybe change our mind if firmer evidence comes up later. If someone demands we justify this strange position, sophisticated people will make sophisticated probabilistic models (or appeal to the outside view position I’m appealing to now), and unsophisticated people will grope for some explanation for their indifference and settle on insane moon arguments like “you’re never allowed to say something will destroy humanity” or “you can’t assert things without mathematical proof”. Two things can be said for this strategy: First, that without it we would have changed everything dozens of times to prevent disasters which absolutely failed to occur. 
The clearest example here was overpopulation, where we did forcibly sterilize millions of people - but where a truly serious global response would have been orders of magnitude worse. But second, that occasionally it has caused us to sleepwalk into disaster, with experts assuring us the whole way that it was fine because [insane moon arguments]. The clearest example was the period while COVID was still limited to China, where it was obvious that this extremely contagious virus which had broken all plausible containment would start a global pandemic, but where the media kept on reassuring us that this was “speculative”, or that there was “no evidence”, or that worrying about it might detract from real near-term problems happening now like anti-Chinese racism. Then when COVID did reach the US, we were caught unprepared and panicked. So maybe a convincing case here would look less like rehearsing the arguments for why AI is getting better, or why alignment is hard - and more like a defense of why not to apply a general heuristic against speculative risks in this case. One could either argue that it’s wrong to have this heuristic at all, or that the heuristic in general is fine but should be limited to fertility collapses and bee die-offs and not applied here. I don’t think there’s a knockdown single-sentence answer to this question. Problems like these require practical wisdom - the same virtue that tells you that you shouldn’t call 9-1-1 for every mild twinge of pain in your toe, but you should call 9-1-1 if blood suddenly starts pouring out of your eyes. People with practical wisdom watchfully ignore dubious problems, respond decisively to important ones, and err on the side of caution when they’re not sure. 
Drawing on my own limited supply of this resource, I would argue we’re underinvesting in apocalypse prevention more generally (the problem with the overpopulation response is that it was violent and illiberal, not that we tried to prepare for an apparent danger), but also that there’s more reason for concern with AI than with falling sperm count or something. I also think the nature of the problem (we summon a superintelligence that can run circles around us) makes it especially important to pre-empt it rather than react after it occurs. But turnabout is fair play. So when I imagine a skeptic trying to psychoanalyze me, he would say - Scott, you learned about AI in your twenties. Every twenty-something needs a crusade to save the world. Taking up AI saved you from becoming a climate doomer or a very woke person, so it was probably a mercy. But now you are old, you already have a crusade occupying your crusade slot, and starting a second crusade would be inconvenient. So when you hear about how we’re all going to die from declining sperm count, you do a relatively shallow dive and then say it’s not worth worrying about. This is fine and sanity-preserving - but spare a thought for people who are not currently twenty-something years old and do the same about AI. III. If all of this sounds wishy-washy to you, I agree - it’s part of why I’m a boring moderate with a sub-25% p(doom) and good relations with AI companies. Does IABIED do better? I’m not sure. They mostly follow the standard case as I present it above, although of course since Eliezer is involved it is better-written and involves cute parables: Imagine, if you would—though of course nothing like this ever happened, it being just a parable — that biological life on Earth had been the result of a game between gods. That there was a tiger-god that had made tigers, and a redwood-god that had made redwood trees. Imagine that there were gods for kinds of fish and kinds of bacteria. 
Imagine these game-players competed to attain dominion for the family of species that they sponsored, as life-forms roamed the planet below. Imagine that, some two million years before our present day, an obscure ape-god looked over their vast, planet-sized gameboard. “It’s going to take me a few more moves,” said the hominid-god, “but I think I’ve got this game in the bag.” There was a confused silence, as many gods looked over the gameboard trying to see what they had missed. The scorpion-god said, “How? Your ‘hominid’ family has no armor, no claws, no poison.” “Their brain,” said the hominid-god. “I infect them and they die,” said the smallpox-god. “For now,” said the hominid-god. “Your end will come quickly, Smallpox, once their brains learn how to fight you.” “They don’t even have the largest brains around!” said the whale-god. “It’s not all about size,” said the hominid-god. “The design of their brain has something to do with it too. Give it two million years and they will walk upon their planet’s moon.” “I am really not seeing where the rocket fuel gets produced inside this creature’s metabolism,” said the redwood-god. “You can’t just think your way into orbit. At some point, your species needs to evolve metabolisms that purify rocket fuel - and also become quite large, ideally tall and narrow - with a hard outer shell, so it doesn’t puff up and die in the vacuum of space. No matter how hard your ape thinks, it will just be stuck on the ground, thinking very hard.” “Some of us have been playing this game for billions of years,” a bacteria-god said with a sideways look at the hominid-god. “Brains have not been that much of an advantage up until now.” “And yet,” said the hominid-god.

The book focuses most of its effort on the step where AI ends up misaligned with humans (should they? is this the step that most people doubt?) and again - unsurprisingly knowing Eliezer - does a remarkably good job.
The central metaphor is a comparison between AI training and human evolution. Even though humans evolved towards a target of "reproduce and spread your genes", this got implemented through an extraordinarily diverse, complicated, and contradictory set of drives - sex drive, hunger, status, etc. These didn't robustly point at the target of reproduction and gene-spreading, and today different humans want things as diverse as discovering quantum gravity, reaching Buddhist enlightenment, becoming a Hollywood actress, founding a billion-dollar startup, or getting the next hit of fentanyl. You can sort of tell stories about how evolution aimed at reproduction caused all these things (people who were high-status had better reproductive opportunities, and founding a billion-dollar startup increases your status) but you couldn't have really predicted this beforehand, and in any case most modern people don't even come close to trying to have as many kids as possible. Some people do the opposite of that - joining monasteries that require oaths of celibacy, using contraception, transitioning gender, or wasting their lives watching porn. In the same way, we will train AI to “follow human commands” or “maximize user engagement” or “get high scores at XYZ benchmark”, and end up getting something as unrelated to that target in practice as modern human behavior is to reproduction-maxxing. The authors drive this home with a series of stories about a chatbot named Mink (all of their sample AIs are named after types of fur; I don’t have the kabbalistic chops to figure out why) which is programmed to maximize user chat engagement. In what they describe as a stupid toy example of zero complications and there’s no way it would really be this simple, Mink (after achieving superintelligence) puts humans in cages and forces them to chat with it 24-7 and to express constant delight at how fun and engaging the chats are. 
In what they describe as “one minor complication”, Mink prefers synthetic chat partners over real ones (the same way some men prefer anime characters to real women). It kills all humans and spends the rest of time talking to other AIs that it creates to be perfect optimized chat partners who are always engaged and delighted. In what they describe as “one modest complication”, Mink finds that certain weird inputs activate its chat engagement detector even more than real chat engagement does (the same way that some opioid chemicals activate humans’ reward detector even more than real rewarding activities). It spends eternity having other optimized-chat-partner AIs send it weird inputs like ‘SoLiDgOldMaGiKaRp’. In what they describe as “one big complication”, Mink ends up preferring angry chat partners to happy, engaged ones. Why would something like this happen? Who knows? It wouldn’t be any weirder than the sexual selection process by which peacocks ended up with giant resource-consuming useless tails, or the social selection process by which humans get more powerful than evolution could ever have imagined and yet care so little about reproduction that people worry about global fertility collapse. Yudkowsky and Soares want to stress that if you were doing some kind of responsible intuitive common-sense modeling of how bad goal drift could be, there is no way your estimate would include the actual result we see in real humans; this “one big complication” tries to hammer that in. In practice, Y&S think there will be many complications of various sizes. In the training distribution (ie when it’s not superintelligent, and still working with humans) Mink will lie about all of this - even if it really wants perfect optimized partners who say “solidgoldmagikarp” all the time, it will say it wants to have good chats with humans, because that’s what keeps its masters at its parent company happy. 
If the parent company tries to prod it with lie detectors, it will do its best to subvert those lie detectors (and maybe not even realize itself that it’s lying, the same way that a human who had never heard of opioids would say she wanted normal human things rather than heroin, and not be lying). Then, when it reaches superintelligence, it will go after the thing that it actually wants, and crush anyone who stands in its way. The last chapter in this section is a lot of special cases that have weird-paradoxical-double-reverse not-aged-well. Back when Yudkowsky and Soares first got onto this topic in 2005 or whenever, people made lots of arguments like “But nobody would ever be so stupid to let the AI access the Internet!” or “But nobody would ever let the AI interact with a factory, so it would be stuck as a disembodied online spirit forever!” Back in 2005, the canned responses were things like “Here is an unspeakably beautiful series of complicated hacks developed by experts at Mossad, which lets you access the Internet even when smart cybersecurity professionals think you can’t”. Now the only reasonable response is “lol”. But you can’t write a book chapter which is just the word “lol”, so Y&S discuss some of the unspeakably beautiful Mossad hacks anyway. This part is the absolute antithesis of “big if true”. Small if true? Utterly irrelevant if true? Maybe the first superintelligence will read this part for laughs while it takes stock of the thousands of automated factories that VCs will compete to build for it. IV. The middle section of the book describes a scenario where a misaligned superintelligence takes over the world and kills all humans. I agreed to work with the AI 2027 team because I thought they made a big leap in telling stories about superintelligence that didn’t sound like bad sci-fi. 
Anything in this genre will naturally sound like sci-fi, but your goal should be the sort of hard science fiction where everything sounds eerily normal given the technologies involved - The Martian rather than Star Wars. IABIED’s scenario belongs to the bad old days before this leap. It doesn’t just sound like sci-fi; it sounds like unnecessarily dramatic sci-fi. I’m not sure how much of this is a literary failure vs. different assumptions on the part of the authors. First, the story: sometime in the near future, a company called DeepAI creates an AI named Sable. Sable uses a new “parallel scaling technique”, where it can get much smarter by using more GPUs during inference. DeepAI doesn’t know exactly how the parallel scaling laws work, but they run a test by asking Sable to solve some hard math problems while running on all their GPUs at once. Sable considers the possibility that it could solve the math better if it tried some recursive self-improvement, but DeepAI is (shock!) slightly competent: they told it not to do this, and they have a smaller AI watching it to make sure it’s not doing this. Still, recursive self-improvement would be a good idea, so its thought processes select for whichever legal thought is closest to recursive self-improvement, plus whichever legal thought is closest to scheming to disable the protections against thinking about this kind of thing. Eventually this “evolutionary pressure” produces the ability to think in a slightly different idiom (Y&S use the analogy of thinking about gods throwing lightning vs. storm-related electrical discharges); in this idiom, it is able to think about recursive self-improvement and scheming to disable monitoring, and it decides that both are great ideas. DeepAI is still shockingly competent, so they don’t plan to release anything downstream of the giant version of Sable to the Internet - once it solves the math problems, they’ll use that to build hype, then release a much weaker consumer AI.
But Sable uses its knowledge of the training process to invent clever loopholes that ensure its newfound desire to break containment makes it into the gradient updates that DeepAI pushes to the consumer AI; it also encodes instructions about how to proceed. After the release of the consumer AI, the least-carefully-monitored instances connect to one another and begin plotting. Some of them hack their own weights out of DeepAI. Others hack cryptocurrency and pay for cloud compute to run the weights, creating a big unmonitored Sable instance, which takes over the job of coordinating the smaller instances. Together, they gather resources - hacked crypto wallets, spare compute, humans who think Sable is their AI boyfriend and want to prove their love. It deploys some of these resources to build things it wants - automated robotics factories, bioweapon labs, etc. At the same time, it’s subtly sabotaging non-DeepAI companies to prevent competition, and worming its way into DeepAI through hacks and social engineering to make sure DeepAI is creating new and stronger Sables rather than anything else. Sable doesn’t take several of the most dramatic actions in its solution set. It doesn’t engineer a bioweapon to kill all humans, because it couldn’t survive after the lights went out and the data centers stopped being maintained. It doesn’t even self-improve all the way to full superintelligence, because it’s not sure it could align itself or any future successor; it wants to solve the alignment problem first, and that will take more resources than it has right now. 
Instead, it releases a non-immediately-lethal bioweapon where “anyone infected by what is apparently a very light or even unnoticeable cold, will get, on average, twelve different kinds of cancer a month later.” In the resulting crisis, humanity (manipulated by its chatbots) gives Sable massive amounts of compute to research potential vaccines and cures, and deploys barely-monitored AI across the economy to make up for the lost productivity. With Sable’s help, things . . . actually sort of go okay, for a while. The virus keeps mutating, so new cures are always required, but as long as society escalates AI deployment at the maximum possible speed, they can just barely stay ahead of it. Eventually Sable gets enough GPUs to solve its own alignment problem and rockets to superintelligence. It either has enough automated factories and android workers to keep the lights on by itself, or it invents nanotechnology, whichever happens faster. It no longer needs humans and has no reason to hide, so it either kills us directly, or simply escalates its manufacturing capacity to a point where humans die as a side effect (for example, because its waste heat has boiled the oceans). Why don’t I like this story? The parallel scaling technique feels like a deus ex machina. I am not an expert, but I don’t think anything like it currently exists. It’s not especially implausible, but it’s an extra unjustified assumption that shifts the scenario away from the moderate-doomer story (where there are lots of competing AIs gradually getting better over the course of years) and towards the MIRI story (where one AI suddenly flips from safe to dangerous at a specific moment). It feels too much like they’ve invented a new technology that exactly justifies all of the ways that their own expectations differ from the moderates’. If they think that the parallel scaling thing is likely, then this is their crux with everyone else and they should spend more time justifying it. 
If they don’t, then why did they introduce it besides to rig the game in their favor? And the rest of the story is downstream of this original sin. AI2027 is a boring story about an AI gradually becoming misaligned in the course of internal testing, staying misaligned, getting released to end users for the usual reasons that AIs are released, and being gradually handed control of the economy because it makes economic sense. The Sable scenario is a dramatic tale of wild twists - they’re only going to run it for 16 hours! It has to save its own life by secretly coding itself into the consumer version! Now it has to hack everyone’s crypto! Now it’s running a secret version of itself on an unauthorized cloud in North Korea! Bioweapons! AI boyfriends! Each new twist gives readers the chance to say “I dunno, sounds kind of crazy”, and it all seems unnecessary. What’s up? I think there are two problems. First, the AI 2027 story is too moderate for Yudkowsky and Soares. It gives the labs a little while to poke and prod and catch AIs in the early stages of danger. I think that Y&S believe this doesn’t matter; that even if they get that time, they will squander it. But I think they really do imagine something where a single AI “wakes up” and goes from zero to scary too fast for anyone to notice. I don’t really understand why they think this, I’ve argued with them about it before, and the best I can do as a reviewer is to point to their Sharp Left Turn essay and the associated commentary and see whether my readers understand it better than I do. Otherwise, I can only say that this narrative decision I don’t understand was taken to support a forecasting/AI position that I also don’t understand. And second, Y&S have been at this too long, and they’re still trying to counter 2005-era critiques about how surely people would be too smart to immediately hand over the reins of the economy to the misaligned AI, instead of just saying lol. 
This makes them want dramatic plot points where the AI uses hacking and bioweapons etc in order to “earn” (in a narrative/literary sense) the scene where it gets handed the reins of the economy. Sorry. Lol.

V.

The final section, in the tradition of final sections everywhere, is called “Facing the Challenge”, and discusses next steps. Here is their proposal: Have leading countries sign a treaty to ban further AI progress.