coronavirus
Article
Coronavirus is a recurring concept in the Astral Codex Ten archive, appearing 11 times across 11 issues between February 01, 2021 and April 09, 2024. The archive places it in contexts such as “This week: the coronavirus”; “If you’re planning the coronavirus response”; “Vitamin D on coronavirus incidence or severity”. It most often appears alongside COVID, CDC, and China.
Metadata
- Category: Concepts
- Mention count: 11
- Issue count: 11
- First seen: February 01, 2021
- Last seen: April 09, 2024
Appears In
- Metaculus Monday
- WebMD, And The Tragedy Of Legible Expertise
- Vitamin D: Much More Than You Wanted To Know
- Links For March
- Book Review: Antifragile
- The Rise And Fall Of Online Culture Wars
- 26
- Adumbrations Of Aducanumab
- Galton, Ehrlich, Buck
- Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
- Highlights From The Comments On The Lab Leak Debate
Related Pages
- COVID (6 shared issues)
- CDC (4 shared issues)
- China (4 shared issues)
- New York Times (4 shared issues)
- Anthony Fauci (3 shared issues)
- FDA (3 shared issues)
- Metaculus (3 shared issues)
- Osama bin Laden (3 shared issues)
- Scott (3 shared issues)
- Stanford (3 shared issues)
- The New York Times (3 shared issues)
- US (3 shared issues)
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
This week: the coronavirus.
Late last year, when coronavirus had already killed 285,000 Americans, Metaculus asked users to predict how many would be dead by the end of 2021. The guesses started at about 500,000. But as cases rose further through December and January, the guesses rose too, until now they're averaging almost 690,000 people.
Inline links: asked users to predict
When will some official like the Director of the CDC announce that a coronavirus vaccine is available to any adult who wants it (as opposed to just front-line workers, only seniors, etc)? The question as asked is a little odd, since it will probably be available in some areas before others, but this is apparently about some kind of nationwide announcement.
This is actually a widespread problem in medicine. The worst offender is the FDA, which tends to list every problem anyone had while on a drug as a potential drug side effect, even if it obviously isn't. This got some press lately when Moderna had to disclose to the FDA that one of the coronavirus vaccine patients got struck by lightning; after a review, this was declared probably unrelated. For the more serious version of this, read Get Ready For False Side Effects. Why does the FDA keep doing this if they know it makes their label information useless? My guess is it's because they don't want to look like cowboys who unprincipledly consider some things but not other things. What if someone accused the person deciding what things to consider of being biased? So the FDA comes up with a Procedure, and once you have a Procedure it has to be "take everything seriously", and then it falls on random small-fry people who aren't the FDA to pick up the slack and explain which side effects are worth worrying about or not, and then those small fries don't do that, because they could get sued.
Inline links: got struck by lightning, Get Ready For False Side Effects
The way I imagine this is that Zvi reads some papers on whether the coronavirus has airborne transmission, sees the direction they're leaning, and announces on his blog that it probably has airborne transmission.
If you're planning the coronavirus response, maybe the best thing you can do is lock Zvi in a cave completely incommunicado and make him write one for you. The moment there's a gap in the cave, thousands of lobbyists and activists and politicians will rush in, trying to sue him or bribe him or threaten him or guilt him into changing it to favor their constituency. If he has the slightest shred of self-preservation, the end result will be some balance between the good plan he would have written earlier, and the stuff he needs to include to avoid getting sued or fired or cancelled or universally-loathed or mobbed.
Most health articles ask you to act on their opinions. I am specifically asking you not to act on mine. In a moment, I'll tell you whether or not I think Vitamin D prevents or treats coronavirus. But I'll give you a free spoiler: I am less than 100% certain of what I'm about to say. So if you want to take Vitamin D, take it. If it does prevent or cure coronavirus, great. If not, the worst that will happen is you'll have slightly better bone health. I can't stress how much I don't want to be those people who said they couldn't prove face masks helped so you must not use face masks. Just ignore everything I'm saying, do a quick cost-benefit calculation, and take Vitamin D. That having been said:
Lots of people think Vitamin D treats coronavirus, and some of them have good evidence. For example, infection rate from coronavirus seems latitude dependent; in general, the further north an area, the worse it's been hit. Northern areas get less sunlight, and sunlight helps produce Vitamin D, so whenever you see a disease that's worse at high latitudes, Vitamin D should be on your short list of potential causes.
Also - in the US, COVID seemed to remit with the summer and worsen over the winter. It's hard to distinguish this from general exponential growth and from the effect of playing ping-pong with gradually loosening/tightening lockdowns, but the US spike this winter was pretty dramatic. Most Northern Hemisphere countries show such a pattern, most equatorial countries don't, and some Southern Hemisphere countries arguably show the opposite. Whenever you see a disease that's better in summer and worse in winter, Vitamin D is one of the possible culprits.
Inline links: seemed to
13: CTRL+F “Blackrock” in this Matt Levine column for a discussion of how we accidentally stumbled into true communism for the good of all. The short version: an investing company called Blackrock owns so much of the economy that it’s in their self-interest to have all companies cooperate for the good of the economy as a whole. While they don’t usually push this too hard, the coronavirus pandemic was a big enough threat that “BlackRock is actually calling drug companies and telling them to cooperate to find a cure without worrying about credit or patents or profits”.
Inline links: in this Matt Levine column
33: Coronavirus can cause loss of smell. A post on Tumblr claims you can track the course of the coronavirus pandemic by measuring average number of stars in Yankee Candle reviews - during peaks in infection, there are lots of bad reviews saying the candles have no scent.
Inline links: average number of stars in Yankee Candle reviews
34: Good news! The FDA has decided that slight changes to coronavirus vaccines to respond to new variants won’t need lengthy clinical trials - this was what I called the “best-case scenario” in my recent coronavirus thread. Credit where credit is due.
Inline links: won’t need lengthy clinical trials, my recent coronavirus thread
Other times it's harder. To choose an example close to my own heart, is it really true - as asserted without argument on page 422 - that Spartan hoplites are antifragile but bloggers are fragile? Spartan hoplites are good at war, which is a sort of disorder (though it's not clear they exactly benefit from it). But in other ways they seem quite fragile. Even slight deviations from their ideal conditions (flat open ground, with a slow enemy lumbering toward them from the front) would knock them off balance. A single break in the ranks would doom them. If a flood or avalanche hit, being stuck in unwieldy armor would assure them a swift death. As for bloggers, during all the greatest crises of the past few years - Trump's election, the BLM protests, coronavirus - my hit count skyrocketed, as people looked for writing that would help them make sense of the situation. What could be a purer example of gaining from volatility?
The post was A Failure, But Not Of Prediction, and it argued that getting your predictions right was less important than calculating payoffs right. For example, if some very smart scientists tell you that there's an 80% chance the coronavirus won't be a big deal, you thank them for their contribution and then prepare for the coronavirus anyway. In the world where they were right, you've lost some small amount of preparation money; in the world where they were wrong, you've saved hundreds of thousands of lives.
Inline links: A Failure, But Not Of Prediction
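The payoff reasoning in the passage above can be sketched as a small expected-cost comparison. This is only an illustration: the 20% probability comes from the passage's "80% chance the coronavirus won't be a big deal", while the cost figures, the function names, and the simplifying assumption that preparation fully averts the loss are all hypothetical.

```python
def expected_costs(p_bad: float, prep_cost: float, unprepared_loss: float) -> tuple[float, float]:
    """Return (expected cost if you prepare, expected cost if you do not).

    Assumption (not from the source): preparing fully averts the loss,
    so its cost is just the up-front preparation cost in either world.
    """
    cost_if_prepare = prep_cost
    cost_if_skip = p_bad * unprepared_loss
    return cost_if_prepare, cost_if_skip


def break_even_probability(prep_cost: float, unprepared_loss: float) -> float:
    """Probability of the bad outcome above which preparing is cheaper."""
    return prep_cost / unprepared_loss


P_BAD = 0.2               # from the passage: experts give 80% odds of "no big deal"
PREP_COST = 1.0           # hypothetical units
UNPREPARED_LOSS = 100.0   # hypothetical units, chosen only to be much larger

prepare, skip = expected_costs(P_BAD, PREP_COST, UNPREPARED_LOSS)
print(f"expected cost if you prepare: {prepare}")
print(f"expected cost if you do not:  {skip}")
print(f"break-even probability:       {break_even_probability(PREP_COST, UNPREPARED_LOSS)}")
```

With any loss much larger than the preparation cost, the break-even probability falls well below 20%, which is the sense in which a correct prediction matters less than a correct payoff calculation.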
Maybe this doesn't work in investing, but does work in real life? But I'm still not seeing it. Sure, banking is fragile and taxi driving is antifragile, but this has already been priced in - investment bankers make big bucks partly to compensate them against the fact that they might get fired after a few years. Exercise is antifragile, but the amount of exercise people do right now already prices in the fact that it will make them healthier and more muscular. Preparing against coronavirus is antifragile, and, um - maybe you could argue that there's an optimal amount of preparation to do before it becomes excessive, and suggesting you do more is implicitly assuming you're not at that point yet?
I think they irony-ed themselves so hard that they accidentally ended up as Nazis. You could always go to 4chan and find people saying horrible racist things ironically. That was the whole point of the anonymous message board - people would troll each other by saying the most awful thing they could think of, and whoever took it seriously first lost. Last I heard from them they were trying to use meme magic to make the coronavirus kill as many people as possible. This is impossible to take seriously, and for a long time their racism was the same way - a lot of the supposedly anti-Semitic posts bore obvious fingerprints of having been written by Jews who were having a fun time laughing at themselves. But if you're in a community where everybody puts a lot of effort into pretending to be pro-racist all the time, and where any breaking of kayfabe gets punished, eventually some people who aren't in on the joke just get actually racist, and if you're so devoted to edginess that you can't politely take those people aside and explain, eventually that takes over the culture. I know this is a weird theory, but Kurt Vonnegut was a smart guy and he said "Be careful who you pretend to be, because you are who you pretend to be".
Some of the more interesting new Metaculus markets. The space telescope one is especially interesting in the context of whether we could use prediction markets to predict (and maybe manage) government delays and cost overruns. The telescope is currently scheduled for launch in October 2025, so the market expects it to be about five years late. For context, the previous space telescope, James Webb, was originally scheduled for 2007 and (if everything goes well) will launch later this year.
Inline links: Metaculus
God Help Us, Let’s Try Predicting The Coronavirus Some More
Anxiety is growing about the new Delta variant of coronavirus. What do the prediction markets say?
So the overall scenario I’m getting from this is that coronavirus remains quite serious for the rest of this year and this winter, probably slightly less bad than last winter, but governments don’t choose to institute very strict lockdowns. Once everyone has been vaccinated or infected, it settles down into something about twice as bad as the flu.
Here’s another good example: coronavirus vaccines. The FDA still has not fully approved any coronavirus vaccine. The only reason you’re allowed to get vaccinated at all is because of a fast-track provisional approval somewhat like the one used for aducanumab. Coronavirus vaccines have probably also averted a few hundred thousand deaths.
The countries that got through COVID the best (eg South Korea and Taiwan) controlled it through test-and-trace. This allowed them to scrape by with minimal lockdown and almost no deaths. But it only worked because they started testing and tracing really quickly - almost the moment they learned that the coronavirus existed. Could the US have done equally well?
I think yes. A bunch of laboratories, universities, and health care groups came up with COVID tests before the virus was even in the US, and were 100% ready to deploy them. But when the US declared that the coronavirus was a “public health emergency”, the FDA announced that the emergency was so grave that they were banning all coronavirus testing, so that nobody could take advantage of the emergency to peddle shoddy tests. Perhaps you might feel like this is exactly the opposite of what you should do during an emergency? This is a sure sign that you will never work for the FDA.
Inline links: announced
Coria: Absolutely not. I’m only recommending the existence of governments, which has been standard practice since Gilgamesh. Many things are rights violations - for example, seizing someone’s property. But when a legitimate government does so in the public interest after due consideration, we accept it as part of living in a society. It was a rights violation to quarantine an entire population in their homes during the early days of the coronavirus. But the legitimate government decided to do it in order to protect the public interest, so it’s not morally equivalent to kidnapping or whatever we would call it if a random person did it. And some states still castrate pedophiles as a punishment - one which naturally includes sterilization - and I have no particular problem with that. So it seems I must believe governments may sometimes involuntarily sterilize citizens when it is in the public interest. Did you know the Supreme Court’s ruling on Buck said that “The principle that sustains compulsory vaccination is broad enough to cover cutting the Fallopian tubes?”
The southwest corner is where most of the wildlife was being sold. Rumor said that included a stall with raccoon-dogs, an animal which is generally teeming with weird coronaviruses, and is a plausible intermediate host between humans and bats:
Image caption: Awwww, come on, you can’t stay mad at this little guy.
China said this rumor was false and refused to release any information. Scientists were finally able to confirm the existence of the raccoon-dog shop in the funniest possible way: a virologist had visited Wuhan in 2014, saw the awful conditions in the shop, and took a picture as an example of the kind of place that a future pandemic might start.
Image caption: Source: NPR. To be fair, we have only the scientist’s word that this is why he had the picture. But he definitely did have it.
People say it would be a surprising coincidence if a zoonotic coronavirus pandemic just so happened to start in a city with a big coronavirus research lab, and this is true. But it would be an even more surprising coincidence if a lab-leak coronavirus pandemic just so happened to first get detected at a raccoon-dog stall in a wet market!
Saar: It’s not clear that the first case was at the wet market; a certain Mr. Chen, with no connection to the market, seems to have fallen sick on December 8. An SCMP article suggested there were 92 previously-undetected cases suspicious for COVID as far back as November. And even if half of the first forty universally-agreed-upon cases had market connections that means another half didn’t. There was a bias towards detecting cases at the market: because authorities thought the market was the origin, and because everyone was thinking about zoonosis after SARS1, they only screened/diagnosed people with a market connection. One of the few non-market-connected COVID cases detected during this period was only detected because he was the relative of a hospital worker; the worker noticed the signs and insisted they go to the hospital despite the lack of a wet market connection.
Although the map of positive samples and cases at the market was centered near the raccoon-dog stall, that could be because that area was sampled more; it’s also close to the mahjong room, where visitors and vendors at the market would go and unwind in a tight, poorly ventilated area.
The next session will focus more on the WIV, but the short version is that they were doing lots of gain of function research. So one story compatible with the evidence is that a worker at WIV got infected with their modified coronavirus and passed it to his contacts. COVID started spreading quietly a few weeks to months before the first market-related case was detected. This accounts for the 92 earlier cases, Mr. Chen’s case, and the half of officially-detected cases with no wet market association. Then an infected person went to the market, causing a super-spreader event. Some of the infected market patrons went to the hospital, where doctors traced it back to the market and told other doctors to be on the lookout for wet market patrons coming in with weird viral pneumonias. They found some, declared victory, and the few anomalies - like the hospital worker’s relative - were forgotten, or assumed to have wet market connections that nobody could find. China quashed all evidence of the lab research (as was done in previous lab leak cases, eg the USSR) so all we have is the apparent wet market links that Peter found so convincing.
Peter: The supposed pre-wet-market cases are confirmed fakes. Yes, the WHO did an investigation of whether there might have been COVID cases circulating before the wet market, and identified 92 unusual pneumonias that merited further review. But their final investigation, which included testing samples from these people after good tests became available, found that none of these people really had COVID.
As for Mr. Chen, he said in an interview that he was hospitalized for dental issues on December 8, caught COVID in the hospital on December 16, and then was erroneously reported as “hospitalized for COVID on December 8”. The December 16 date is after the first wet market cases.
Further, it seems epidemiologically impossible for COVID to have been circulating much before the first cases were officially detected December 11. The COVID pandemic doubles every 3.5 days. So if the first infection was much earlier - let’s say November 11 - we would expect 256x as much COVID as we actually saw. Even if the first couple of cases were missed because nobody was looking for them, the number of hospitalizations, deaths, etc, in January or whenever were all consistent with the number of people you’d expect if the pandemic started in early December - and not consistent with 256x that many people. So probably we should just accept that the first reported case - a wet market vendor, December 11 - was very early in the pandemic. She wasn’t literally the first case - that would most likely have been someone who worked at the raccoon-dog shop, whose case might (like 95% of COVID cases) have been mild enough not to come to medical attention. But she was certainly very early.
Although authorities eventually decided COVID spread through a wet market and started deliberately looking for wet market connections, this only happened on December 30. So the earliest cases - including the 40 very earliest cases where half came from the wet market - weren’t biased (at least not through that particular route). So the claim that “the first case, and half of the first 40 cases, had wet market connections” stands as real and convincing evidence.
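The "256x" figure in Peter's doubling-time argument can be checked with a short calculation. This is a minimal sketch assuming clean exponential growth at the 3.5-day doubling time quoted in the passage; the function name is illustrative, and the year 2019 is supplied here rather than stated in the passage.

```python
from datetime import date


def growth_factor(days: float, doubling_time_days: float = 3.5) -> float:
    """Multiplicative growth over `days`, assuming a constant doubling time."""
    return 2.0 ** (days / doubling_time_days)


# Dates from the passage: hypothetical earlier start vs. first detected case.
hypothetical_start = date(2019, 11, 11)
first_detected = date(2019, 12, 11)
gap_days = (first_detected - hypothetical_start).days

print(f"gap: {gap_days} days")
print(f"doublings in that gap: {gap_days / 3.5:.2f}")
print(f"growth over the full gap: about {growth_factor(gap_days):.0f}x")
print(f"growth over exactly 8 doublings (28 days): {growth_factor(28):.0f}x")
```

Under these assumptions, 256x corresponds to exactly eight doublings (28 days), and a full month at the same rate gives a somewhat larger multiple, so the quoted figure reads as a round-number lower bound. Saar's later objection, that early spread may have had a slower doubling time, amounts to changing the `doubling_time_days` parameter.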
Although the exact center of the map of positive COVID samples in the wet market was the mahjong room, the samples taken from the mahjong room were not, themselves, positive (cf: although a low-resolution population density map of New York might show Central Park in the exact center of the population density gradient, Central Park does not itself have population).
There was no real “super-spreader event” at the wet market. There was a slow burn - one case the first day, a few more the next day, a few more the day after that. It’s hard to see how a single visit from an infected lab worker could do that. So the only way it could possibly be a lab leak is if the lab leaked sometime in late November, infected exactly one lab worker, that worker went straight to the wet market, infected a vendor, then went home, quarantined, recovered, and all other cases were downstream of that first infected wet market vendor. This is unparsimonious.
Saar: The only source saying that Mr. Chen got sick early was an anonymous interview. And even if he was later than the first wet market cases, nobody was able to find any wet market connections. This means that whoever infected him was earlier than the index case and not linked to the wet market.
Peter argued that COVID couldn’t have been more than a few weeks old when the first wet market cases were detected. But this was based on its known doubling rate. If pre-discovery COVID had a slower doubling time than known COVID, it could have been around longer. And post-lockdown serology suggested numbers that were larger than claimed at the time. So contra Peter’s claims, the infection could have been going on longer, which wouldn’t require the first lab worker to go straight to the market. It could have been weeks.
Dr. Jesse Bloom’s investigation of the wet market samples, considered the final and most conclusive, failed to find a clear connection between COVID and raccoon-dogs or any other animals.
Although the concentration of positive samples seemed highest near the raccoon dog stall, if you do a formal statistical analysis of which animals’ DNA was found near COVID samples most often, raccoon dogs are near the bottom. The top is wide-mouth bass, which can’t get COVID. This is obviously contamination, probably from infected humans touching wide-mouth bass tanks or something.
Although the Chinese data included a negative sample from a mahjong table, it included a mention of poultry being sold nearby, which might mean this wasn’t the mahjong room itself, but some other mahjong table at a poultry shop elsewhere in the market, and (dry) mahjong tables might not hold the virus well anyway.
Peter: Raccoon-dogs were sold in various cages at various stalls, separated by air gaps big enough to present a challenge for COVID transmission, and there’s no reason to think that one raccoon-dog would automatically pass it to all the others. The statistical analysis just proves there were many raccoon-dogs who didn’t have COVID. But you only need one. The raccoon dog shop and the drain leading out of the raccoon dog shop had some of the highest positive sample rates, which is more interesting than a statistical analysis which everyone agrees must be wrong (since it favors bass). It’s unclear why the negative mahjong sample says something about poultry, but based on the stated location, it’s definitely the one in the mahjong room.
Session 1.5: Lineages
This was technically part of Session 2, but formed enough of a discrete topic that I found it confusing to intermix it with all the other viral genetics points. I’m spinning it out into a separate summary, but the videos are all in the next session.
Yuri: The coronavirus eventually mutated into many different strains. But the first big split, seen in some of the earliest samples, is between two different sub-strains called Lineage A and Lineage B, which differ by two mutations.
In these two mutations, Lineage A is the same as BANAL-52, a bat virus which is the closest-known relative of COVID, but Lineage B is different. Since COVID probably evolved from something like BANAL-52, Lineage A must have come first, spread for a while, and then gotten two new mutations, turning it into Lineage B.
All of the cases at the wet market, including the first detected case, were Lineage B. Lineage A wasn’t discovered until about a week later, and none of the Lineage A patients had been to the wet market.
Image caption: Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods.
This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there.
Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it’s always been twice as many cases.
But also, Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first.
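Peter's step from "twice as many cases" to "started 3-4 days earlier" follows from exponential growth: two strains growing at the same rate keep a constant ratio, and that ratio fixes the head start. A minimal sketch, assuming both lineages double at the same constant rate; the function name is illustrative and not from the source.

```python
import math


def head_start_days(case_ratio: float, doubling_time_days: float) -> float:
    """Head start implied by a constant case ratio between two strains.

    If both strains double every T days and strain B began d days before
    strain A, then B/A = 2 ** (d / T), so d = T * log2(B/A).
    """
    return doubling_time_days * math.log2(case_ratio)


# From the passage: about 2/3 of early cases were Lineage B and 1/3 Lineage A.
ratio_b_to_a = 2 / 1

for doubling_time in (3.0, 3.5, 4.0):
    days = head_start_days(ratio_b_to_a, doubling_time)
    print(f"doubling time {doubling_time} days -> head start {days:.1f} days")
```

A ratio of exactly two equals one doubling, so the implied head start is simply the doubling time itself, which is where the 3-4 day figure comes from. The Yuri/Saar rebuttal (that Lineage B got an extra boost at the market) challenges the equal-growth-rate assumption this calculation depends on.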
Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans.
Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain. Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans.
Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market!
Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market.
Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market.
There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market.
Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. Stop calling the wet market a super-spreader event.
The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug.
The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan.
If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations.
Session 2: Viral Genetics
Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened.
(quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.)
We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the protein arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codons CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end. 
That suggests it was already adapted to humans. This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. 
Even if WIV did try to create COVID, they couldn’t have. As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither. Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work. Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did. COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature. 
For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12 - 15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains. Sometimes insertions happen because of a mistake in viral replication. Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG. COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form. 
Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. But it’s exactly the kind of weird thing that happens in the random chance of evolution. COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them. The claim that COVID is uniquely adapted to humans is false. The paper making that claim defined how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals. It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species. In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience.
They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness. Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market.
Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this. The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses. The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural. The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up.
Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015.
All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips. Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky.
Session 3: Closing Arguments
This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here.
Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus.
Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. 
It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate. Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate.
Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor. And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one.
Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper.
It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”. The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started.
Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine, but you can see Saar’s version here, and Peter’s starting at 45:33 of this video. Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate.
Rootclaim’s real final probability is 94% lab leak. You can see their analysis here.
And The Winner Is . . .
… … … … …
Peter and the zoonosis hypothesis. This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. You can see the judges’ own summary of their reasoning here (Will, Eric). Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing. Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting. And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey). For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis.
III. The Math And The Aftermath
The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length.
The Math: Judges
Both judges included a probabilistic analysis in their written decision.
Here’s the same table as above, expanded to add the judges: I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then includes the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. 
I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns.
The Math: Extreme Odds
Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example:
A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc.
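Saar's DNA example amounts to mixing the in-model probability with the chance that the model itself is wrong. A minimal sketch of that arithmetic follows. The 1-in-10,000 rate for lab mistakes, framing, and similar failures is a made-up illustrative number, and `likelihood_with_model_error` is a hypothetical helper, not anything Saar or Rootclaim published.

```python
def likelihood_with_model_error(p_in_model: float,
                                p_model_wrong: float,
                                p_if_wrong: float = 1.0) -> float:
    """P(evidence | hypothesis), allowing for the model behind p_in_model being wrong."""
    return (1 - p_model_wrong) * p_in_model + p_model_wrong * p_if_wrong

# In-model chance that a random innocent person matches every DNA marker.
p_random_match = 1e-9
# ASSUMPTION (illustrative only): 1 in 10,000 matches arise from lab mistakes,
# framing, transferred objects, etc., in which case a match is treated as certain.
p_other_explanations = 1e-4

p_match_given_innocent = likelihood_with_model_error(p_random_match, p_other_explanations)
# ASSUMPTION: a guilty suspect always matches, so P(match | guilty) = 1.
bayes_factor = 1.0 / p_match_given_innocent

print(f"P(match | innocent) = {p_match_given_innocent:.3e}")
print(f"Bayes factor for guilt = {bayes_factor:,.0f}")
```

With these assumed inputs the Bayes factor falls from about a billion to about ten thousand: the result is dominated by the assumed error rate, not by the 1E-9 figure.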
So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
Inline links: NPR, found that, Mr. Chen, failed to find a clear connection between COVID and raccoon-dogs, Pekar 2022, Pipes 2021, says, here, this video, Will, Eric, agreed, put out a report on the lab leak hypothesis, surveyed experts, see here
Source: NPR. To be fair, we have only the scientist’s word that this is why he had the picture. But he definitely did have it. People say it would be a surprising coincidence if a zoonotic coronavirus pandemic just so happened to start in a city with a big coronavirus research lab, and this is true. But it would be an even more surprising coincidence if a lab-leak coronavirus pandemic just so happened to first get detected at a raccoon-dog stall in a wet market! Saar: It’s not clear that the first case was at the wet market; a certain Mr. Chen, with no connection to the market, seems to have fallen sick on December 8. An SCMP article suggested there were 92 previously-undetected cases suspicious for COVID as far back as November. And even if half of the first forty universally-agreed-upon cases had market connections that means another half didn’t. There was a bias towards detecting cases at the market: because authorities thought the market was the origin, and because everyone was thinking about zoonosis after SARS1, they only screened/diagnosed people with a market connection. One of the few non-market-connected COVID cases detected during this period was only detected because he was the relative of a hospital worker; the worker noticed the signs and insisted they go to the hospital despite the lack of a wet market connection. Although the map of positive samples and cases at the market was centered near the raccoon-dog stall, that could be because that area was sampled more; it’s also close to the mahjong room, where visitors and vendors at the market would go and unwind in a tight, poorly ventilated area. The next session will focus more on the WIV, but the short version is that they were doing lots of gain of function research. So one story compatible with the evidence is that a worker at WIV got infected with their modified coronavirus and passed it to his contacts. 
COVID started spreading quietly a few weeks to months before the first market-related case was detected. This accounts for the 92 earlier cases, Mr. Chen’s case, and the half of officially-detected cases with no wet market association. Then an infected person went to the market, causing a super-spreader event. Some of the infected market patrons went to the hospital, where doctors traced it back to the market and told other doctors to be on the lookout for wet market patrons coming in with weird viral pneumonias. They found some, declared victory, and the few anomalies - like the hospital worker’s relative - were forgotten, or assumed to have wet market connections that nobody could find. China quashed all evidence of the lab research (as was done in previous lab leak cases, eg the USSR) so all we have is the apparent wet market links that Peter found so convincing. Peter: The supposed pre-wet-market cases are confirmed fakes. Yes, the WHO did an investigation of whether there might have been COVID cases circulating before the wet market, and identified 92 unusual pneumonias that merited further review. But their final investigation, which included testing samples from these people after good tests became available, found that none of these people really had COVID. As for Mr. Chen, he said in an interview that he was hospitalized for dental issues on December 8, caught COVID in the hospital on December 16, and then was erroneously reported as “hospitalized for COVID on December 8”. The December 16 date is after the first wet market cases. Further, it seems epidemiologically impossible for COVID to have been circulating much before the first cases were officially detected December 11. The COVID pandemic doubles every 3.5 days. So if the first infection was much earlier - let’s say November 11 - we would expect 256x as much COVID as we actually saw. 
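Peter's 256x figure is plain exponential-growth arithmetic. A minimal sketch, assuming a constant 3.5-day doubling time with no saturation (the function name `growth_factor` is illustrative, not from the debate): 256 is 2 to the 8th power, so it corresponds to eight doublings, or 28 days, roughly the month between November 11 and December 11.

```python
def growth_factor(extra_days: float, doubling_time_days: float = 3.5) -> float:
    """How many times more infections result from starting `extra_days` earlier."""
    return 2 ** (extra_days / doubling_time_days)

# 28 extra days is exactly 8 doublings at 3.5 days each.
four_weeks = growth_factor(28)
# The full 30 days from November 11 to December 11 gives a somewhat larger factor.
thirty_days = growth_factor(30)

print(f"28 days earlier: {four_weeks:.0f}x")
print(f"30 days earlier: {thirty_days:.0f}x")
```

Either way, the multiplier is in the hundreds, which is the point of the argument: a start date a month earlier would imply far more hospitalizations and deaths than were observed.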
Even if the first couple of cases were missed because nobody was looking for them, the number of hospitalizations, deaths, etc, in January or whenever were all consistent with the number of people you’d expect if the pandemic started in early December - and not consistent with 256x that many people. So probably we should just accept that the first reported case - a wet market vendor, December 11 - was very early in the pandemic. She wasn’t literally the first case - that would most likely have been someone who worked at the raccoon-dog shop, whose case might (like 95% of COVID cases) have been mild enough not to come to medical attention. But she was certainly very early. Although authorities eventually decided COVID spread through a wet market and started deliberately looking for wet market connections, this only happened on December 30. So the earliest cases - including the 40 very earliest cases where half came from the wet market - weren’t biased (at least not through that particular route). So the claim that “the first case, and half of the first 40 cases, had wet market connections” stands as real and convincing evidence. Although the exact center of the map of positive COVID samples in the wet market was the mahjong room, the samples taken from the mahjong room were not, themselves, positive (cf: although a low-resolution population density map of New York might show Central Park in the exact center of the population density gradient, Central Park does not itself have population). There was no real “super-spreader event” at the wet market. There was a slow burn - one case the first day, a few more the next day, a few more the day after that. It’s hard to see how a single visit from an infected lab worker could do that. 
So the only way it could possibly be a lab leak is if the lab leaked sometime in late November, infected exactly one lab worker, that worker went straight to the wet market, infected a vendor, then went home, quarantined, recovered, and all other cases were downstream of that first infected wet market vendor. This is unparsimonious. Saar: The only source saying that Mr. Chen got sick early was an anonymous interview. And even if he was later than the first wet market cases, nobody was able to find any wet market connections. This means that whoever infected him was earlier than the index case and not linked to the wet market. Peter argued that COVID couldn’t have been more than a few weeks old when the first wet market cases were detected. But this was based on its known doubling rate. If pre-discovery COVID had a slower doubling time than known COVID, it could have been around longer. And post-lockdown serology suggested numbers that were larger than claimed at the time. So contra Peter’s claims, the infection could have been going on longer, which wouldn’t require the first lab worker to go straight to the market. It could have been weeks. Dr. Jesse Bloom’s investigation of the wet market samples, considered the final and most conclusive, failed to find a clear connection between COVID and raccoon-dogs or any other animals. Although the concentration of positive samples seemed highest near the raccoon dog stall, if you do a formal statistical analysis of which animals’ DNA was found near COVID samples most often, raccoon dogs are near the bottom. The top is wide-mouth bass, which can’t get COVID. This is obviously contamination, probably from infected humans touching wide-mouth bass tanks or something. 
Although the Chinese data included a negative sample from a mahjong table, it included a mention of poultry being sold nearby, which might mean this wasn’t the mahjong room itself, but some other mahjong table at a poultry shop elsewhere in the market, and (dry) mahjong tables might not hold the virus well anyway. Peter: Raccoon-dogs were sold in various cages at various stalls, separated by air gaps big enough to present a challenge for COVID transmission, and there’s no reason to think that one raccoon-dog would automatically pass it to all the others. The statistical analysis just proves there were many raccoon-dogs who didn’t have COVID. But you only need one. The raccoon dog shop and the drain leading out of the raccoon dog shop had some of the highest positive sample rates, which is more interesting than a statistical analysis which everyone agrees must be wrong (since it favors bass). It’s unclear why the negative mahjong sample says something about poultry, but based on the stated location, it’s definitely the one in the mahjong room. Session 1.5: Lineages This was technically part of Session 2, but formed enough of a discrete topic that I found it confusing to intermix it with all the other viral genetics points. I’m spinning it out into a separate summary, but the videos are all in the next session. Yuri: The coronavirus eventually mutated into many different strains. But the first big split, seen in some of the earliest samples, is between two different sub-strains called Lineage A and Lineage B, which differ by two mutations. In these two mutations, Lineage A is the same as BANAL-52, a bat virus which is the closest-known relative of COVID, but Lineage B is different. Since COVID probably evolved from something like BANAL-52, Lineage A must have come first, spread for a while, and then gotten two new mutations, turning it into Lineage B. All of the cases at the wet market, including the first detected case, were Lineage B. 
Lineage A wasn’t discovered until about a week later, and none of the Lineage A patients had been to the wet market.
[Image caption: Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods.]
This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there.
Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it has always had twice as many cases. Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first. Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans. Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country.
It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain. Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans. Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market! Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market. Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market. There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market. 
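Peter’s head-start argument from earlier in this session (a steady 2/3 share for Lineage B, explained by a 3-4 day lead) is easy to check with a few lines of arithmetic. This is only a sketch under simplifying assumptions: both lineages grow as pure exponentials with the same doubling time, and the function names are made up for illustration rather than taken from the debate.

```python
import math

def share_of_earlier_lineage(head_start_days, doubling_days):
    """Long-run share of cases belonging to the lineage that started first.

    Assumes both lineages grow exponentially with the same doubling time,
    so the earlier one stays ahead by a constant factor of
    2 ** (head_start_days / doubling_days).
    """
    ratio = 2 ** (head_start_days / doubling_days)
    return ratio / (1 + ratio)

def implied_head_start(share, doubling_days):
    """Invert the calculation: what head start produces a given steady share?"""
    return doubling_days * math.log2(share / (1 - share))

if __name__ == "__main__":
    # A head start of exactly one doubling time gives a 2:1 ratio.
    print(share_of_earlier_lineage(3.5, 3.5))
    # Going the other way from an observed 2/3 share.
    print(implied_head_start(2 / 3, 3.5))
```

Under these assumptions the share never changes once both lineages are spreading, which is why a constant 2/3 to 1/3 split is the thing Peter treats as evidence of timing. It says nothing about Yuri and Saar’s alternative, where a superspreader event gives B a one-time boost.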
Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. Stop calling the wet market a super-spreader event. The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug. The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. 
Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations. Session 2: Viral Genetics Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened. (quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.) We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. 
There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the protein arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codons CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end. That suggests it was already adapted to humans. 
This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. Even if WIV did try to create COVID, they couldn’t have. 
As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither. Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work. Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did. COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature. 
For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12 - 15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains. Sometimes insertions happen because of a mistake in viral replication. Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG. COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form. 
Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. But it’s exactly the kind of weird thing that happens in the random chance of evolution.
COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them.
The claim that COVID is uniquely adapted to humans is false. The paper that made this claim measured how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals.
It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species.
In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience.
They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness. Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market.
Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this.
The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses.
The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural.
The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up.
Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015.
All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips. Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky. Session 3: Closing Arguments This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here. Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus. 
Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. 
It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location; we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate.
Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite of the point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate.
Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it’s based on check-in data and people don’t check in to seafood markets. Most of those 1600 places aren’t amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they’re supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren’t a common spillover location in SARS1, so the zoonosis hypothesis hasn’t “called” this event in a way that should give it a high Bayes factor.
And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one.
Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper.
It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”.
The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started.
Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine[3], but you can see Saar’s version here, and Peter’s starting at 45:33 of this video.
Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren’t entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar’s final odds here are 533-to-1, this is just the central estimate.
Rootclaim’s real final probability is 94% lab leak. You can see their analysis here.
And The Winner Is . . .
… … … … …
Peter and the zoonosis hypothesis.
This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. You can see the judges’ own summary of their reasoning here (Will, Eric).
Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing[4].
Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting.
And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey).
For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis.
III. The Math And The Aftermath
The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length.
The Math: Judges
Both judges included a probabilistic analysis in their written decision.
Here’s the same table as above, expanded to add the judges: I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then includes the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. 
I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns. The Math: Extreme Odds Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example: A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. 
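Saar's DNA example can be put into numbers with a small sketch. The 1E-9 random-match figure comes from the quoted passage; the 1-in-1,000 chance of some other innocent explanation (lab mistake, framing, a transferred object) and the assumption that a guilty suspect always matches are purely illustrative values chosen here, not figures from the debate.

```python
def match_probability_if_innocent(p_random_match, p_other_explanation):
    """P(DNA match | innocent), counting non-random routes to a match.

    p_other_explanation lumps together lab mistakes, framing, transferred
    objects and so on. If none of those happened, a match can still occur
    by pure coincidence with probability p_random_match.
    """
    return p_other_explanation + (1 - p_other_explanation) * p_random_match


def bayes_factor_for_guilt(p_match_if_guilty, p_match_if_innocent):
    """Likelihood ratio P(match | guilty) / P(match | innocent)."""
    return p_match_if_guilty / p_match_if_innocent


P_RANDOM_MATCH = 1e-9      # from the quoted example
P_OTHER_EXPLANATION = 1e-3 # hypothetical value, for illustration only
P_MATCH_IF_GUILTY = 1.0    # simplifying assumption

# Prosecutor's version: only coincidence can produce an innocent match.
naive = bayes_factor_for_guilt(
    P_MATCH_IF_GUILTY, match_probability_if_innocent(P_RANDOM_MATCH, 0.0)
)

# Saar's version: other innocent explanations dominate the denominator.
hedged = bayes_factor_for_guilt(
    P_MATCH_IF_GUILTY,
    match_probability_if_innocent(P_RANDOM_MATCH, P_OTHER_EXPLANATION),
)

print(f"naive Bayes factor:  {naive:.3g}")
print(f"hedged Bayes factor: {hedged:.3g}")
```

Under these made-up inputs the Bayes factor falls from roughly a billion to roughly a thousand: once any alternative explanation is more likely than the headline coincidence, the alternative sets the ceiling on how strong the evidence can be.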
So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
Inline links: NPR, found that, Mr. Chen, failed to find a clear connection between COVID and raccoon-dogs, Pekar 2022, Pipes 2021, says, 3, here, this video, Will, Eric, agreed, 4, put out a report on the lab leak hypothesis, surveyed experts, see here
Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods. This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there. Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it has always had twice as many cases. Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first. Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans. Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain. 
Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans. Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market! Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market. Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market. There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market. Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. 
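The doubling-time arithmetic behind Peter's "B started 3-4 days before A" claim can be sketched as follows. It assumes pure exponential growth at the same rate in both lineages and a single seeding case for each, which is a simplification made here for illustration; the 2:1 case ratio and the 3.5-day doubling time are the figures quoted in the debate, and the function names are hypothetical.

```python
import math


def head_start_days(case_ratio, doubling_time_days):
    """Head start needed for one lineage to stay `case_ratio` times larger
    than another, when both grow exponentially at the same rate."""
    return doubling_time_days * math.log2(case_ratio)


def cases_after(days, doubling_time_days, initial_cases=1.0):
    """Expected cases after `days` of unchecked exponential growth."""
    return initial_cases * 2 ** (days / doubling_time_days)


DOUBLING_TIME = 3.5  # days, as quoted in the debate
CASE_RATIO = 2.0     # Lineage B : Lineage A, i.e. 2/3 vs 1/3 of cases

lead = head_start_days(CASE_RATIO, DOUBLING_TIME)

# Check that the ratio stays fixed later on: compare the two lineages on an
# arbitrary later day (day 30 after Lineage A's first case).
ratio_on_day_30 = (
    cases_after(30 + lead, DOUBLING_TIME) / cases_after(30, DOUBLING_TIME)
)

print(f"head start: {lead} days")
print(f"B:A ratio on day 30: {ratio_on_day_30:.3f}")
```

Because both curves have the same growth rate, the ratio depends only on the head start, which is why a constant 2:1 split maps onto exactly one doubling time.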
Stop calling the wet market a super-spreader event. The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug. The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. 
And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations. Session 2: Viral Genetics Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened. (quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.) We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. 
Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the amino acid arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codon CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end. That suggests it was already adapted to humans. This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. 
The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. Even if WIV did try to create COVID, they couldn’t have. As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither. 
Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work. Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did. COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature. For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12 - 15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains. Sometimes insertions happen because of a mistake in viral replication. 
Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG. COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form. Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. 
But it’s exactly the kind of weird thing that happens in the random chance of evolution. COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them. The claim that COVID is uniquely adapted to humans is false. The paper that made this claim defined how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals. It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species. In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience. They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness. 
Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market. Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this. The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses. The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural. The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up. Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015. All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips. 
Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky. Session 3: Closing Arguments This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here. Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus. Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. 
Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate. Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. 
We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet-market-based outbreaks we know about. So this makes the opposite point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate. Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor. And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one. Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper. It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. 
His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”. The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started. Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine [3], but you can see Saar’s version here, and Peter’s starting at 45:33 of this video. Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate. Rootclaim’s real final probability is 94% lab leak. You can see their analysis here. And The Winner Is . . . Peter and the zoonosis hypothesis. This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. 
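The odds and probabilities quoted above (533-to-1 as a central estimate, 94% as a final probability) are two scales for the same kind of quantity, and a Bayesian table of the sort both sides presented arrives at its final odds by multiplying a prior by one Bayes factor per piece of evidence. A minimal sketch of that bookkeeping follows; the function names and the three example Bayes factors are hypothetical, and only the 533 and 94% figures come from the text.

```python
from math import prod


def odds_to_probability(odds):
    """Convert odds in favor (e.g. 533 for 533-to-1) to a probability."""
    return odds / (1 + odds)


def probability_to_odds(probability):
    """Convert a probability to odds in favor."""
    return probability / (1 - probability)


def posterior_odds(prior_odds, bayes_factors):
    """Multiply prior odds by the Bayes factor of each piece of evidence.

    This treats the pieces of evidence as independent, which is itself a
    modeling assumption.
    """
    return prior_odds * prod(bayes_factors)


# Figures quoted in the text.
p_central = odds_to_probability(533)   # 533-to-1 central estimate
odds_final = probability_to_odds(0.94) # 94% final probability

# Illustrative only: even prior odds and three made-up Bayes factors.
example = posterior_odds(1.0, [10, 0.5, 4])

print(f"533-to-1 as a probability: {p_central:.4f}")
print(f"94% as odds: {odds_final:.1f}-to-1")
print(f"example posterior odds: {example}-to-1")
```

Seen this way, the gap between the central estimate and the published figure is large: 533-to-1 is about 99.8%, while 94% is only about 16-to-1.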
You can see the judges’ own summary of their reasoning here (Will, Eric) Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing4. Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting. And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey): For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis. III. The Math And The Aftermath The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length. The Math: Judges Both judges included a probabilistic analysis in their written decision. Here’s the same table as above, expanded to add the judges: I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. 
For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then included the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns. 
The Math: Extreme Odds Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example: A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. 
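To make the arithmetic of this argument concrete, here is a minimal sketch (not anything either debater actually computed). It treats the real p(wet market | lab leak) as a mixture: with probability 1 - eps Peter's model is right and the answer is his 1/10,000; with probability eps something is wrong with the model and some fallback value applies. The names `effective_likelihood`, `eps`, and `p_if_wrong`, and the fallback value of 0.5, are illustrative assumptions, not figures from the debate.

```python
def effective_likelihood(p_in_model: float, eps: float, p_if_wrong: float) -> float:
    """Blend an in-model likelihood with an out-of-model fallback.

    p_in_model: likelihood if the argument is sound (1/10,000 in the text)
    eps:        probability that the argument itself is flawed (assumption)
    p_if_wrong: likelihood to use if the argument is flawed (assumption)
    """
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must be a probability between 0 and 1")
    return (1.0 - eps) * p_in_model + eps * p_if_wrong


P_IN_MODEL = 1 / 10_000  # Peter's in-model estimate, taken from the text
P_IF_WRONG = 0.5         # hypothetical fallback if the model is flawed

# Show how quickly a small doubt about the model dominates the result.
for eps in (0.0, 0.0001, 0.01, 0.1):
    p = effective_likelihood(P_IN_MODEL, eps, P_IF_WRONG)
    print(f"eps={eps:<7} effective p = {p:.6f}  (about 1 in {1 / p:,.0f})")
```

Under these assumed numbers, even a 1% chance that the argument is flawed raises the effective likelihood to roughly fifty times the in-model 1/10,000, which is why extreme in-model odds are so sensitive to out-of-model doubt.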
If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
Inline links: Pekar 2022, Pipes 2021, says, 3, here, this video, Will, Eric, agreed, 4, put out a report on the lab leak hypothesis, surveyed experts, see here
Before going further, I recommend reading page 8 of the supplementary text of Worobey’s paper, titled “Robustness Of Statistical Test Results To Ascertainment Bias”, or pages 14-17, “Additional Data Related To Case Ascertainment Biases”, which explain all the reasons he thinks this isn’t true. I promise you aren’t the first person to think that maybe Worobey could be contaminated by ascertainment bias. If that still doesn’t help, Worobey talks more about his strategy for avoiding ascertainment bias here. Most important, he counted only cases from December; the market connection was discovered December 30 and added to diagnostic criteria January 3. This doesn’t mean bias is impossible - some of these points are people who caught COVID on December 31, but only got diagnosed January 4 after the new diagnostic criteria were added. But most cases are pre-criteria. And Worobey looked at various subsets of pre-criteria cases and found they were all at least as market-focused as the overall set. For example, he looked at the earliest COVID records in one Wuhan hospital system: 10 of these hospitals’ 19 earliest COVID-19 cases were linked to Huanan Market (∼53%), comparable both to Jinyintan’s 66% (of 41 cases) (4) and to the WHO-China report’s 33% of 168 retrospectively identified cases within Wuhan across December 2019 (1). Regarding cases at the Wuhan Central Hospital and HPHICWM, patients with a history of exposure at Huanan Market could not have been “cherry picked” before anyone had identified the market as an epidemiologic risk factor. Hence, there was a genuine preponderance of early COVID-19 cases associated with Huanan Market. Likewise, a study conducted January 2 (so not impacted at all by the January 3 criteria) found that 27 of 41 known patients had market links. Likewise, the first five cases were all detected in the market, and it doesn’t even make sense to talk about ascertainment bias for these. What is the Weissman paper that observeralt is talking about? 
It argues: if the pandemic started at the market, each seemingly non-market-linked case must ultimately derive from a market-linked case. Therefore, we should expect non-market-linked cases to require more steps than market-linked cases. Therefore, they should be further away. But if we look at the map above, we see that not-market-linked cases are closer to the market than market-linked cases. So something must be wrong, and that something might be ascertainment bias. (at least this is my interpretation of Weissman’s argument, which is more mathematical; read the paper to make sure I’m getting it right). This is a weirdly spherical-cow view of an epidemic, worthy of a physicist. It’s easy to think of reasons the linked-cases-should-be-closer rule might not hold. For example, suppose that on their lunch break, market vendors go have lunch at restaurants surrounding the market. They infect people in these restaurants, who then infect their friends and family. But these people never went to the market themselves. Now there are a bunch of non-market-linked cases immediately surrounding the wet market. But also - of all markets in Wuhan, Huanan sold the most weird wildlife. Suppose someone in the boonies gets a craving for raccoon-dog one day, their local convenience store doesn’t have it, so they hop on a bus and go downtown to the city’s main wet market. Then they get infected with COVID. Now there’s a wet-market-linked case in the boonies. In other words, we should expect two modes of spread: general geographic diffusion from the epicenter, and people from far away who made specific trips. If this still doesn’t seem obvious to you, consider - usually when COVID first arrived in America or Brazil or wherever, they were able to trace it back to a specific person from Wuhan who visited the country. If I was the first person in America to get COVID, I could usually say “Oh, it must have been my business meeting with Mr. Chin from Wuhan”. 
At the same time, if someone from the next town over from Wuhan got COVID, they probably couldn’t trace it back to a specific Wuhanite - everyone from Wuhan is coming and going so often that my town is just full of COVID in general. So I don’t think Weissman’s paper proves anything, and I think the general pattern of blue and orange dots suggests ascertainment bias wasn’t playing a role. So why does George Gao say that there was ascertainment bias? I looked for the direct source of the Gao quote and couldn’t find it; if someone else is able to, please let me know, since I’d be interested in exactly what he thinks about this. 1.10: Connor Reed / Gwern on cats Gwern wrote: Yes, I don't understand this (paraphrased) claim by Peter: > He also told the Mail that his cat got the coronavirus too, which is impossible. 'Impossible', thus implying the man was lying? I was under the impression that, quite aside from cats having tons of coronaviruses in general (FCoV being a particularly serious threat to young cats, which also seems to be a remarkable case study of the harms of the FDA), that it was not just not 'impossible' for domestic pet cats to get the coronavirus too, it was routine for them to get COVID-19, and even other cat species in *zoos* have tested positive and this was true very early in the COVID-19 pandemic and quite well publicized and well known (eg April 2020 https://www.nationalgeographic.com/animals/article/tiger-coronavirus-covid19-positive-test-bronx-zoo ). This was a topic of interest to me at the time because I like cats and have a cat and was wondering what the implications of me being inevitably infected might be for my cat, and so I remember this quite well despite my general attempt to remain ignorant of as many COVID-19 matters as possible... 
And double-checking now to see if all of these reports were somehow false positives or faked, I continue to see everyone like the CDC stating that it is still totally possible and routine for cats in close contact with infected humans (you know, like a *pet* cat) to be infected with COVID-19: https://www.cdc.gov/healthypets/covid-19/pets.html Given that Peter has supposedly spent years autistically researching every last detail and this detail in particular in order to discredit that British dude, I'm experiencing sudden Gell-Mann Amnesia here about the rest of his claims, as well as the supposed experts evaluating Peter's claims if they didn't flag that (I have not checked). This is in the context of Connor Reed, a British man who claimed to have gotten COVID on November 25 - which, if true, would be surprisingly (though not impossibly) early according to the zoonosis narrative. Peter argued his story didn’t hold up, and one of his points centered around his claim that his cat might have caught COVID from him and died. Unfortunately, I mis-quoted Peter. I said Peter argued it was impossible for his cat to get COVID-19 (false). His actual statement was that it’s extremely rare for a cat to die of COVID-19. Peter, Gwern, and I then proceeded to get very confused about the exact claims and timeline, which I think is because Connor said totally different things in different interviews: In an interview with Wales Online on 2/4/2020, he said that "my kitten caught the feline coronavirus and developed pneumonia and died, but I don't think I caught it from her. I think that was just coincidence.”
Inline links: here, 4, 1, a study, the Weissman paper, https://www.nationalgeographic.com/animals/article/tiger-coronavirus-covid19-positive-test-bronx-zoo, https://www.cdc.gov/healthypets/covid-19/pets.html, an interview with Wales Online
HKU1 might also fit these criteria. It’s a coronavirus discovered in 2004 that seems to have spilled over in China and spread globally (it’s fine; it just causes yet another subtype of common cold). The exact animal reservoir has never been identified, although Wikipedia says it “likely originated from rodents”.
In May 2003, Guan et al (2003) identified SARS-CoV-like virus in animals in a live-animal market in Shenzhen, Guangdong Province, China. Guan et al (2003) also tested for antibodies among workers in the market. They note that “8 out of 20 (40%) of the wild-animal traders and 3 of 15 (20%) of those who slaughter these animals had evidence of antibody, only 1 (5%) of 20 vegetable traders was seropositive.” This suggests that the majority of the infections of the 11 people with close contact with animals were zoonotic. Among 508 animal traders, 66 (13%) tested positive for IgG antibody to SARS associated coronavirus by ELISA, while the control groups including hospital workers, Guangdong CDC workers, and healthy adults at clinic had an antibody prevalence of 1–3%.
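The percentages quoted above can be checked directly, and a rough back-of-the-envelope version of the "majority were zoonotic" inference is sketched below. The counts come from the passage; treating the vegetable traders' 5% rate as a background rate for people without animal contact is a simplifying assumption made here for illustration, not something stated by Guan et al.

```python
# (seropositive, tested) counts quoted in the passage above
groups = {
    "wild-animal traders": (8, 20),
    "animal slaughterers": (3, 15),
    "vegetable traders": (1, 20),
    "animal traders (larger survey)": (66, 508),
}


def prevalence(positive: int, tested: int) -> float:
    """Fraction of tested people who were seropositive."""
    return positive / tested


for name, (positive, tested) in groups.items():
    print(f"{name}: {positive}/{tested} = {prevalence(positive, tested):.0%}")

# Pool the two animal-contact groups from the market sample.
animal_positive = 8 + 3   # the "11 people" mentioned in the passage
animal_tested = 20 + 15

# Assumption: vegetable traders approximate the non-animal background rate.
background_rate = prevalence(1, 20)
expected_background = background_rate * animal_tested
excess = animal_positive - expected_background

print(f"expected from background alone: {expected_background:.2f}")
print(f"excess attributable to animal contact: {excess:.2f} of {animal_positive}")
```

Under that assumption, only about 2 of the 11 seropositive animal-contact workers would be expected from background exposure, consistent with the passage's reading that most of those infections came from animals. With samples this small, the estimate is very rough.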
Backlinks
- Adumbrations Of Aducanumab
- Anthony Fauci
- Book Review: Antifragile
- Concepts: C
- Concepts: H
- Concepts: J
- Concepts: L
- Concepts: P
- Concepts: R
- Concepts: S
- Concepts: W
- Connor Reed
- Vitamin D: Much More Than You Wanted To Know
- Cuomo
- Events: B
- Events: E
- Events: S
- Galton, Ehrlich, Buck
- Highlights From The Comments On The Lab Leak Debate
- HKU1
- Huanan
- Lineage A
- Links For March
- 26
- Metaculus Monday
- Michael Weissman
- Mr. Chen
- Osama bin Laden
- People: A
- People: B
- People: C
- People: M
- People: O
- Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
- raccoon-dogs
- Saar
- Sanders campaign
- SARS1
- The Rise And Fall Of Online Culture Wars
- Venues: H
- Venues: W
- Venues: X
- WebMD, And The Tragedy Of Legible Expertise
- wet market
- Wuhan
- Wuhan Institute of Virology
- Xinfadi Market
- zoonosis