Yunnan
Article
Yunnan is a recurring place in the Astral Codex Ten archive, appearing 4 times across 4 issues between June 17, 2021 and April 09, 2024. The archive places it in contexts such as “The plague probably existed in say Yunnan, China”; “Yunnan province and Laos, which are more than a thousand kilometers away from Wuhan”; “even further than Yunnan”. It most often appears alongside China, BANAL-52, and COVID-19.
Metadata
- Category: Places
- Mention count: 4
- Issue count: 4
- First seen: June 17, 2021
- Last seen: April 09, 2024
Appears In
- Your Book Review: Plagues And Peoples
- Your Book Review: Viral
- Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
- Highlights From The Comments On The Lab Leak Debate
Related Pages
- China (4 shared issues)
- BANAL-52 (3 shared issues)
- COVID-19 (3 shared issues)
- furin cleavage site (3 shared issues)
- Huanan seafood market (3 shared issues)
- Rootclaim (3 shared issues)
- SARS (3 shared issues)
- WIV (3 shared issues)
- Wuhan (3 shared issues)
- Wuhan Institute of Virology (3 shared issues)
- Australia (2 shared issues)
- Canada (2 shared issues)
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
Mongol caravans introduced the plague to rats and then spread the plague rats across the world. The plague probably existed in say Yunnan, China where the locals had developed a complex set of myths and traditions to say, not eat rats. When the Mongols came they trapped the rats, got the plague and spread it in China. China’s population decreased from around 123 million in 1200 to around 65 million in 1331. Then the Mongols brought the rats to Europe in 1346.
Photograph of the famous Latané and Darley experiment, circa 1968. So, what could those participants have been thinking? Maybe something like: Hmm, why’s the room filling up with smoke? Is this a problem? *looks around the room* Well nobody else seems to care, so I guess not. Looking back at the early stages of the COVID-19 pandemic, I think maybe this is why so many of us didn’t think twice about the location of the initial outbreak. Hmm, is it kinda suspicious that this virus broke out near a major virology institute that works on bat coronaviruses? Should we maybe look into that? *looks around* Well nobody else seems to think so, so I guess not. I can’t speak for everyone else, but this was at least my mindset. I had vaguely heard something about how there was a virology research institute close to where the pandemic broke out, and that some conspiracy theorists were claiming it was the source of the virus. I looked around and noticed that nobody was really taking this idea seriously, so I figured I didn’t need to take it seriously either. Also, I was thinking something like: Eh, probably every major city has labs and research institutes doing this kind of research. And I’ll bet they purposely built the virology institute close to where these viruses occur in nature, to give them easy access for sampling. Well, it turns out both of these things are wrong. The type of research conducted at the Wuhan Institute of Virology (WIV) is pretty rare and specialized. It includes things like creation of chimeric coronaviruses [1, 2], infecting humanized mice with bat coronaviruses, and other types of gain of function research, which Chan and Ridley devote a chapter to. The WIV is one of only a few institutions in the world doing this type of research. It’s not the case, as I had assumed, that every major university has a couple labs doing similar work. So it does seem like a pretty remarkable coincidence that the outbreak happened in Wuhan.
But maybe they purposely built the Wuhan Institute of Virology close to where these viruses are found in nature? Well, this also turns out to be wrong. The areas where viruses most similar to SARS-CoV-2 are found in nature are Yunnan province and Laos, which are more than a thousand kilometers away from Wuhan. The authors put this distance in perspective by noting that it’s more than the distance between Orlando and NYC. Image source: https://www.bloomberg.com/news/features/2020-12-30/china-is-making-it-harder-to-solve-the-mystery-of-how-covid-began If SARS-CoV-2 originated in an animal somewhere around the Yunnan / Laos area, how did it make it all the way to Wuhan without leaving a trail along the way? 4. The story of RaTG13 Although I enjoyed the book, I do have one pretty major criticism. The authors repeatedly make the claim that a virus called RaTG13, which was being studied at the WIV before the pandemic, is the closest known genetic match to SARS-CoV-2. But this claim is outdated and no longer correct. In September 2021 researchers identified a virus called BANAL-52 in Laos that’s a 96.8% match to SARS-CoV-2, closer than RaTG13’s 96.2% match. (Important note: a 96.8% match is still a long way off in genomic space, and does not imply that this is the same virus as SARS-CoV-2, or even necessarily a progenitor.) At first I thought maybe the authors didn’t mention BANAL-52 because it was discovered after the book was published, but this isn’t the case – Viral was published November 16, 2021, nearly two months after the discovery of BANAL-52 was published. Although I’m writing an overall-positive review here, I don’t want to go easy on the book where serious criticism is warranted. It’s completely unacceptable that BANAL-52 wasn’t mentioned. Even if it would have been inconvenient from a publishing standpoint, the authors should have rewritten the RaTG13 chapter, or at least included an addendum about the discovery of BANAL-52. 
With that being said, I think the story of RaTG13 is still interesting and important, so I’ll give a quick summary here. At the start of the pandemic in 2020, SARS-CoV-2 was quickly sequenced, and the full genome sequence was published by Dr. Shi Zhengli’s team at the WIV. In this paper, they also briefly mentioned that the genome was a 96.2% match with another bat coronavirus called RaTG13 – the closest known match at the time. Oddly, the mention of RaTG13 did not include any reference, footnote, or link to any previously published sequence. Although the WIV didn’t provide details on this mysterious RaTG13 virus, a group of internet volunteers, including both amateurs as well as professional scientists working in their free time, began to investigate. This loose collection of open-source researchers, called DRASTIC, uncovered a medical thesis describing an outbreak of a mysterious disease in 2012. Six men who had been working in a bat-infested mine in Mojiang County, China, fell ill and were admitted to a hospital with symptoms including dry coughs, shortness of breath, fevers, muscle aches, headaches, and fatigue. Three of the men eventually died of this mysterious illness. In the years following this incident, teams of researchers (including a team led by Dr. Shi Zhengli of the WIV) were sent to investigate the cause of this illness and collect samples from the Mojiang mine. This sampling led to the discovery of a novel SARS-like coronavirus in 2013, and a part of its genomic sequence was published under the name BtCoV/4991 in 2016. The DRASTIC researchers discovered that RaTG13 was genetically identical to the BtCoV/4991 sequence from the Mojiang mine – it was the same virus, and had just been renamed for some reason, without any public record of the change. They also discovered that at least eight other closely related coronaviruses were also sampled from this mine and brought to the WIV. 
Although unhelpful throughout the investigation, the WIV eventually verified these facts when pressed on them, and an addendum was added to the original paper confirming DRASTIC’s account of the origin of RaTG13. So what should we make of this? Well, as I mentioned before, RaTG13 is no longer the closest known genetic match to SARS-CoV-2, so maybe the whole story is less important as it pertains to the origin of the pandemic. But the discovery of BANAL-52 doesn’t really resolve things either [2]. Laos is very far away from Wuhan (actually even further than Yunnan), so we’re left with the same question as before – how did SARS-CoV-2 make it all the way to Wuhan from such a distant natural reservoir without leaving a trail along the way? 5. Lack of institutional transparency and competence A lot of the book is devoted to criticizing the Chinese government’s lack of transparency during the pandemic. Some brief examples: In the early days of the initial outbreak in Wuhan, hundreds of people were investigated and punished for the crime of “spreading rumors”. This included whistleblowing doctors who attempted to warn others [3] about the spread of the disease and its human-to-human transmission, which was being denied by the Chinese government at the time.
Inline links: 1, 2, infecting humanized mice with bat coronaviruses, https://substackcdn.com/image/fetch/$s_!6khv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fac0a8f26-8f26-4712-b245-797d478ae385_1246x642.png, https://www.bloomberg.com/news/features/2020-12-30/china-is-making-it-harder-to-solve-the-mystery-of-how-covid-began, virus called BANAL-52, published by Dr. Shi Zhengli’s team at the WIV., addendum
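The reviewer's note that a 96.8% match is "still a long way off in genomic space" can be made concrete with a rough calculation. The sketch below converts the two percent-identity figures quoted in the passage into approximate counts of differing nucleotides. The genome length of 29,903 nucleotides (the Wuhan-Hu-1 reference sequence) is an assumption brought in for illustration and does not appear in the source passage; the function name is hypothetical, and the estimate ignores insertions, deletions, and alignment details.

```python
# Assumed, not from the source passage: length in nucleotides of the
# SARS-CoV-2 reference genome (Wuhan-Hu-1).
GENOME_LENGTH_NT = 29_903


def approx_differences(percent_identity: float,
                       genome_length: int = GENOME_LENGTH_NT) -> int:
    """Rough count of differing nucleotides implied by a percent identity."""
    return round((1.0 - percent_identity / 100.0) * genome_length)


# Percent-identity figures as quoted in the review.
banal_52 = approx_differences(96.8)
ratg13 = approx_differences(96.2)
print(f"BANAL-52: ~{banal_52} differing positions")
print(f"RaTG13:   ~{ratg13} differing positions")
```

Under that assumed genome length, both viruses differ from SARS-CoV-2 at on the order of a thousand positions, which is the sense in which neither is close to being the same virus.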
Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods. This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there. Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it’s always been twice as many cases. But Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first. Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans. Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain.
Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans. Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market! Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market. Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market. There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market. Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. 
Stop calling the wet market a super-spreader event. The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug. The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. 
And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations. Session 2: Viral Genetics Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened. (quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.) We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. 
Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the protein arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codons CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end. That suggests it was already adapted to humans. This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. 
The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. Even if WIV did try to create COVID, they couldn’t have. As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither. 
Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work. Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did. COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature. For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12 - 15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains. Sometimes insertions happen because of a mistake in viral replication. 
Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG. COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form. Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. 
But it’s exactly the kind of weird thing that happens in the random chance of evolution. COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them. The claim that COVID is uniquely adapted to humans is false. The paper that made this claim defined how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals. It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species. In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience. They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness.
Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market. Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this. The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses. The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural. The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up. Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015. All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips.
Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky. Session 3: Closing Arguments This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here. Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus. Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. 
Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate. Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. 
We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate.
Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor. And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one.
Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper. It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. 
His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”. The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started.
Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine [3], but you can see Saar’s version here, and Peter’s starting at 45:33 of this video. Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate. Rootclaim’s real final probability is 94% lab leak. You can see their analysis here.
And The Winner Is . . .
… … … … … Peter and the zoonosis hypothesis. This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. 
You can see the judges’ own summary of their reasoning here (Will, Eric). Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing [4]. Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting. And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey). For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis.
III. The Math And The Aftermath
The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length.
The Math: Judges
Both judges included a probabilistic analysis in their written decision. Here’s the same table as above, expanded to add the judges: I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. 
For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then includes the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns. 
The Math: Extreme Odds Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example: A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. 
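Saar's general point can be put as a small calculation. The sketch below is illustrative only: the 1/10,000 figure is Peter's from the passage, while `p_flaw` (the chance that Peter's argument is mistaken) and `p_if_flawed` (how likely a wet-market start would be in that case) are hypothetical numbers chosen for the example, not values either debater gave.

```python
def effective_likelihood(p_model, p_flaw, p_if_flawed):
    """Probability of the evidence once the argument itself may be wrong.

    p_model:     probability of the evidence if the argument is sound
    p_flaw:      probability that the argument is flawed (hypothetical input)
    p_if_flawed: probability of the evidence if it is flawed (hypothetical input)
    """
    return (1 - p_flaw) * p_model + p_flaw * p_if_flawed


p_model = 1 / 10_000   # Peter's p(wet market | lab leak), from the passage
p_flaw = 0.01          # assumption: 1% chance the argument is wrong
p_if_flawed = 0.5      # assumption: coin flip if the argument is wrong

eff = effective_likelihood(p_model, p_flaw, p_if_flawed)
print(f"nominal:   1 in {1 / p_model:,.0f}")
print(f"effective: 1 in {1 / eff:,.0f}")
```

With these made-up inputs, even a small chance of a flawed argument dominates the result, which is why the choice of `p_flaw` deserves the care the next paragraph asks for.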
If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
Inline links: Pekar 2022, Pipes 2021, says, 3, here, this video, https://substackcdn.com/image/fetch/$s_!8aU2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815fe32d-d7ea-401b-b3a2-d8cd25b52ee8_490x780.png, https://substackcdn.com/image/fetch/$s_!0Tm_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0492f69-7b7e-4611-9d76-64ef8d7f59d5_511x511.png, Will, Eric, agreed, 4, put out a report on the lab leak hypothesis, https://substackcdn.com/image/fetch/$s_!g7k2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37f1b493-b556-41ec-925e-03f9d8bc26cb_1456x849.webp, surveyed experts, see here, https://substackcdn.com/image/fetch/$s_!Zejl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c88e87-b6ca-4c6d-840e-24da726f50b7_975x365.png, https://substackcdn.com/image/fetch/$s_!T5rV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4983e2cd-4151-42de-9685-08037ef7a8e8_635x788.png
1. Xiao et al (2021) (https://www.nature.com/articles/s41598-021-91470-2), which includes a co-author of Worobey et al (2022), a leading zoonosis paper, states in table 1 that the raccoon dogs were wild caught in Hubei, not farmed as you assert in the piece. This alone rules out raccoon dogs as plausible hosts for two independently sufficient reasons. Firstly, there is unanimity in the literature that the bat ancestral virus to SARS-CoV-2 is in southern Yunnan or South East Asia. Everyone agrees with this, including Shi Zhengli. If a species was wild caught in Hubei, then there would be no explanation of how it acquired the ancestral bat virus, given that Hubei is 1000 miles from southern Yunnan.
This alone isn’t fatal to lab leak. It’s perfectly possible for the lab to leak (let’s say) November 5th, the virus spreads a bit, and then a month later someone goes to the wet market, coughs on a vendor, and starts the officially recognized pandemic. But if that were true, you’d expect (let’s say) 30 cases by early December. Let’s say the wet market vendor was exactly Case # 30. She infected the other wet market vendors, starting a pandemic with an obvious center at the wet market and lots of infected wet market vendors and patrons. What about Case # 29? If they were (let’s say) a barista, how come they didn’t infect people at their coffee shop? How come there wasn’t a second obvious cluster radiating out from a coffee shop, lots of coffee-shop-linked cases, etc? How come there weren’t 30 equally-sized clusters? In order to avoid this, you either need to claim that the wet market was a perfect superspreader location, or that the pattern with lots of cases in the wet market and few-to-none anywhere else was a result of ascertainment bias. Saar made both those arguments during the debate, but I thought Peter rebutted them effectively. 1.4: COVID in Brazilian wastewater Nicholas Halden (blog) writes: What should we make of this study, which found the presence of covid in Brazilian wastewater in late 2019? Consider the doubling times. The study says that scientists working in late 2020 found COVID in samples of Brazilian wastewater from November 27, 2019. This was long before the first detected case of transmission in Brazil on March 13, 2020. Between November 27, 2019 and March 13, 2020 is about 16 weeks, so 32 COVID doubling times. 32 doubling times with no lockdown is enough time for COVID to infect every single person in Brazil. If COVID had infected everyone in Brazil before the first recognized case, we would have noticed. 
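The doubling arithmetic above can be checked in a few lines. This is a rough sketch, not a model of real transmission: it assumes uninterrupted doubling every 3.5 days from a single case, and `BRAZIL_POPULATION` is an approximate figure supplied here for illustration, not taken from the passage.

```python
from datetime import date

DOUBLING_DAYS = 3.5               # doubling time quoted in the passage
BRAZIL_POPULATION = 210_000_000   # assumption: rough 2020 population

sample_date = date(2019, 11, 27)  # date of the wastewater sample
first_case = date(2020, 3, 13)    # first detected transmission in Brazil

days = (first_case - sample_date).days
doublings_exact = days / DOUBLING_DAYS
doublings_rounded = 16 * 7 / DOUBLING_DAYS  # the passage's "about 16 weeks"


def cases_after(doublings, seed=1):
    """Cases after a given number of doublings, starting from `seed` cases."""
    return seed * 2 ** doublings


print(f"{days} days, {doublings_exact:.1f} doublings (exact dates)")
print(f"{doublings_rounded:.0f} doublings (rounded to 16 weeks)")
print(f"cases, exact:   {cases_after(doublings_exact):,.0f}")
print(f"cases, rounded: {cases_after(doublings_rounded):,.0f}")
```

The exact gap is 107 days, a little under 16 weeks, so the count comes to roughly 30.6 doublings rather than 32. Either way the unchecked total exceeds the assumed population several times over, so the rounding does not change the conclusion.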
(Again, COVID doubling time isn’t exactly invariably 3.5 days, but here we’re talking about numbers big enough that the exact details don’t matter very much.) So if COVID was in Brazil on November 27, it must have fizzled out instead of going pandemic. How likely is that? If one person had COVID, it’s not too unlikely - not all COVID cases transmit it forward. If (let’s say) twenty people had COVID, it’s very unlikely - at that point, the law of large numbers takes over; in a freak coincidence, every single patient would have to fail to infect anyone else. So almost certainly fewer than 20 people in Brazil had COVID on November 27. So which is more likely - that somehow 20 people had COVID long before the virus was officially detected, and on a totally different continent, yet somehow a scientist looking through wastewater found the water from exactly those people and managed to detect the virus? Or that there was a sampling error, which happens all the time in these kinds of things? Peter wrote a blog post on some of these issues. He found that there were positive tests from wastewater samples as early as March 2019, which doesn’t fit anyone’s timeline, including lab leakers’. And most of these positives (including the Brazilian sample) contained later strains of the virus with mutations it picked up late in 2020. So these were almost certainly false positives from contamination.
1.5: Biorealism’s 16 arguments
Biorealism has a list of sixteen arguments, which he liked so much that he posted it three times in the ACX comments, twice on Less Wrong, twice on Manifold, and about a dozen times on Twitter under multiple account names. Some posts were slightly different from others, but a typical version is: Importantly, Miller incorrectly claimed the N501Y mutation would result from passage in hACE2 mice (mixed them up with BALB/c mice). The major papers Miller relied on have been seriously challenged since the debate. 
See Stoyan and Chiu (2024), Weissman (2024), Bloom (2023) and Lv et al (2024). Overall the circumstantial evidence makes lab v plausible: Peter admitted getting this wrong during the debate. I think this very minor point about mice mutations was approximately his only mistake in 15 hours of debating, and he admitted it as soon as he noticed. Biorealism somehow heard about this (obviously not through watching the debate, as we’ll see in a moment), then left about 20-30 comments starting with it, under various accounts, on various platforms, as if it somehow discredited Peter. This is making me somewhat less charitable to him and his 16 arguments than I would be otherwise. 1. Chinese researchers Botao & Lei Xiao observed lab origin was likely given the nearest known relatives to SARS-CoV-2 were far from Wuhan. Wuhan Institute of Virology (WIV) sampled SARS-related bat coronaviruses where the nearest relatives are found in Yunnan, Laos and Vietnam ~1500km away. They refuse to share their records. The ancestral viruses of SARS were found equally far from where SARS spilled over into humans, so we know it’s possible (and likely) for viruses to travel that far. 2. Patrick Berche, DG at Institut Pasteur in Lille 2014-18, notes you would expect secondary outbreaks if it arose via the live animal trade. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234839/ There are constant outbreaks of weird coronaviruses in animal handlers. See eg this paper, which estimates about 60,000 of these per year. None of these ever go anywhere, because the farmers are in rural areas that aren’t dense enough to sustain a high R0, and the epidemic fizzles out after a single digit number of cases. Any early outbreaks of COVID would have vanished into this long and mostly unnoticed list. 3. Molecular data: Only sarbecovirus with a furin cleavage site. Well adapted to human ACE2 cells. Low genetic diversity indicating a lack of prior circulation (Berche 2023). 
Restriction site SARS-CoV-2 BsaI/BsmBI restriction map falls neatly within the ideal range for a reverse genetics system and used previously at WIV and UNC. Ngram analysis of the codon usage per Professor Louis Nemzer https://twitter.com/BiophysicsFL/status/1667232580255490053?t=IJgitS5cw364ioclzVWxaA&s=19 The SARS2 backbone is very low in CG and CpG. While the 12-nt insert that gives it the FCS is extremely high in both. Almost as if it was some kind of chimera of a consensus sequence and a codon-optimized polybasic cleavage site? https://twitter.com/BiophysicsFL/status/1752800486837678377?t=EpIRgyybJVaPgeMP5xdstA&s=19 https://www.biorxiv.org/content/10.1101/2022.10.18.512756v1 https://link.springer.com/article/10.1007/s10311-021-01211-0?fbclid=IwAR1HMUMtLIAFOFppVasQDeoIAYrVhP8j4YoPO4wnaTOUiKLsllZl_oKryOw Most of this was discussed extensively in the second session of the debate, which I recommend. The CGG-CGG arginine codon usage is particularly unusual but used in synthetic biology. I asked a synthetic biologist about this. He said: » “Nope. I would literally never do this if I was designing a small insert (maybe I wouldn't notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn't engineer a viral genome using human codon usage. An engineer would not do it.” 4. DEFUSE full proposal: virus 20% different from SARS1, consensus seq assembled with 6 segments, without disrupting coding seq, BsmBI order, FCS. SARS2: 20% different than SARS1, 6 evenly spaced fragments w BsmBI and BsaI restriction sites, FCS. Jesse Bloom, Jack Nunberg, Robert Townley, Alexandre Hassanin have observed this workflow could have lead to SARS-CoV-2. Work often begins before funding sought or goes ahead anyway. Re: 4 - Also scattered across second section of debate, also not going to retread 5. Market cases were all lineage B. 
Lv et al (2024) indicates there was a single point of emergence and A came before B. So market cases not the primary cases. See also Bloom (2021), Kumar et al (2022). Peter Ben Embarek said there were likely already thousands of cases in Wuhan in December 2019. https://t.co/50kFV9zSb6 https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/34398234/ https://academic.oup.com/bioinformatics/article/38/10/2719/6553661 There was a Lineage A sample in the market, lab leak proponents just try to ignore/dismiss/conspiracize it away. The first two known Lineage A cases were very close to the market. Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi) found some weird COVID variants in Shanghai that might or might not mean anything; you can see some discussion of the implications here, but I don’t think they’re strong evidence either way. If A was first, it means some really weird coincidences have to happen to give us the spread rates and genetic clock data we get, but they’re not necessarily weirder in the zoonosis hypothesis than the lab leak one. The claim that there were “thousands of cases in Wuhan in December 2019” is very easy to disprove by doubling rate arguments like the one above, by the blood bank study mentioned above, by the WHO’s failed case search, and by many other lines of argument. 6. Evidence for lineage A in the market is based on a low quality sample according to Liu et. al. (2023). I really think lab leakers need to decide whether they think China is a sinister actor trying to cover up the truth, or whether they should trust every offhand comment by Chinese government officials as gospel. Dr. Liu doesn’t explain in what sense he thinks the Lineage A sample is “low-quality”, and the Western scientists who I asked about this said they didn’t understand this complaint and that the sample was fine. 
A Western team re-analyzing the same sample describes it as “conclusively contain[ing] Lineage A.” I think most lab leakers have switched from trying to deny the genetics to claiming that this was “contamination”, which also doesn’t make sense (the sample is genetically very early). Note that aside from this sample, the first two Lineage A cases discovered were both very close to the wet market. 7. Bloom (2023) shows market samples do not support market origin. There is also no evidence of transmission in the claimed susceptible animals elsewhere. https://academic.oup.com/ve/advance-article/doi/10.1093/ve/vead089/7504441 Discussed extensively in my article as well as the first section of the debate. 8. Lineage A and B only two mutations apart. François Ballox, Bloom and Virginie Courtier-Orgogozo note this is unlikely to reflect two separate animal spillovers as opposed to incomplete case ascertainment of human to human transmission (Bloom 2021). Discussed extensively in my article as well as the first section of the debate. 9. Sampling bias. George Gao, Chinese CDC head at the time, acknowledged to the BBC stating they may have focused too much on and around the market and missed cases on the other side of the city. David Bahry outlines the documented bias. Michael Weissman has shown this mathematically. https://journals.asm.org/doi/10.1128/mbio.00313-23 https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnae021/7632556 Re: Dr. Gao, see above comment about Chinese officials. See the section Ascertainment Bias below for why I disagree with this specific claim, which also addresses the Michael Weissman argument. 10. Spatial statistics experts show the Worobey claim the market was the early epicentre was flawed. 
https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954 Re: 10 - See Confirmation Of The Centrality Of The Huanan Market Among Early COVID-19 Cases, a response to the paper you cite: The centrality of Wuhan's Huanan market in maps of December 2019 COVID-19 case residential locations, established by Worobey et al. (2022a), has recently been challenged by Stoyan and Chiu (2024, SC2024). SC2024 proposed a statistical test based on the premise that the measure of central tendency (hereafter, "centre") of a sample of case locations must coincide with the exact point from which local transmission began. Here we show that this premise is erroneous. SC2024 put forward two alternative centres (centroid and mode) to the centre-point which was used by Worobey et al. for some analyses, and proposed a bootstrapping method, based on their premise, to test whether a particular location is consistent with it being the point source of transmission. We show that SC2024's concerns about the use of centre-points are inconsequential, and that use of centroids for these data is inadvisable. The mode is an appropriate, even optimal, choice as centre; however, contrary to SC2024's results, we demonstrate that with proper implementation of their methods, the mode falls at the entrance of a parking lot at the market itself, and the 95% confidence region around the mode includes the market. Thus, the market cannot be rejected as central even by SC2024's overly stringent statistical test. I think this response is pretty strong. In one analysis, they show that even though the other paper’s methodology is worse than theirs, if you apply it correctly (instead of inappropriately excluding various cases like the paper’s authors did), the center of all early cases in Hubei province lands on the wet market parking lot. 
In another analysis, they show that the other paper’s recommended tests wouldn’t have correctly pointed to the offending water pump in the famous John Snow cholera outbreak, but theirs would have. Still, I think it’s useful to supplement fancy statistics with normal common sense, so I recommend just looking at the map of early cases: …and deciding whether you think the assumptions behind a specific statistical test are likely to debunk the idea that cases are centered around the wet market.
11. Wuhan used as a control for a 2015 serological study on SARS-related bat coronaviruses due to its urban location. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6178078/
I don’t know why this point is supposed to matter. If you mean that Wuhan isn’t directly exposed to bats, nobody ever said it was. The zoonotic theory is that wildlife carted in from other areas of China started the pandemic in the wet market.
12. Superspreader events also seen at wet markets in Beijing and Singapore (Xinfadi and Jurong).
This was discussed very extensively in the debates, both in section 1 and section 3. Wet markets weren’t “superspreader locations” - in fact, the disease spread no more quickly there than anywhere else. They were the first place in those cities that the pandemic started, due to contaminated animal products. If anything, this supports zoonosis. See also my discussion with Saar on this point below.
13. WIV refuse to share their records with NIH who terminated subaward in 2022. Wider suspension over biosafety concerns. https://www.bloomberg.com/news/articles/2023-07-18/us-suspends-wuhan-institute-funds-over-covid-stonewalling
Although WIV has not been especially forthcoming, some of their databases were leaked in various ways and showed that they did not have any viruses capable of transforming into COVID.
14. PLA involvement at WIV and MERS research prior to SARS-COV-2. MERS features several similarities with SARS-CoV-2. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7022351/
I can’t even tell what conspiracy theory you’re trying to propose with this one; if you spell it out I can try to explain why it might be false.
15. SARS1 leaked several times and SARS-COV-2 has leaked from a BSL-3 lab in Taiwan.
Agreed that SARS leaked several times. It also spilled over from animals several times. During the debate, a lab leak rate of once per lab per 500 years was proposed (everyone agreed to steelman this by 10x for WIV numbers); I would be interested to know whether anything about the study of SARS challenges that number.
16. Unpublished infectious clone identified from Wuhan contradicting arguments such reverse genetics systems would be published. https://www.biorxiv.org/content/10.1101/2023.02.12.528210v1.full
I asked some scientists about this paper and here’s what they told me. Wuhan University sequenced some rice. In the middle of the sequence, there’s an unexpected sequence from a common coronavirus, HKU4. The most likely explanation is that someone else in Wuhan was working on the coronavirus and there was cross-contamination. Plausibly this is Wuhan Institute of Virology, who is known to work with coronaviruses. This is cool detective work, but it’s not clear what it’s supposed to prove. I think some lab leakers are using it to prove that WIV can do reverse genetics, but they admitted this already in a published paper so that’s not too helpful. I think others are using it to prove WIV had “secret viruses” in their catalogue, but the rice virus wasn’t secret, it was HKU4, which is common and which WIV has already published papers about.
1.6: DrJayChou’s 7 Arguments
Once again, I cannot stress enough how much better a take you might have on this debate if you watch it.
“The first known case predates the market outbreak by a month” - this is not the consensus position. I cannot say for sure what Dr. Chou means by this, but I suspect he’s referring to one of the many claims to this effect that Peter effectively debunked during the debate (Connor Reed, Mr. Chen, the 92 cases, Brazil, etc).
Inline links: blog, writes, this study, wrote a blog post on some of these issues, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234839/, this paper, https://twitter.com/BiophysicsFL/status/1667232580255490053?t=IJgitS5cw364ioclzVWxaA&s=19, https://twitter.com/BiophysicsFL/status/1752800486837678377?t=EpIRgyybJVaPgeMP5xdstA&s=19, https://www.biorxiv.org/content/10.1101/2022.10.18.512756v1, https://link.springer.com/article/10.1007/s10311-021-01211-0?fbclid=IwAR1HMUMtLIAFOFppVasQDeoIAYrVhP8j4YoPO4wnaTOUiKLsllZl_oKryOw, https://t.co/50kFV9zSb6, https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/34398234/, https://academic.oup.com/bioinformatics/article/38/10/2719/6553661, here, describes it as, https://academic.oup.com/ve/advance-article/doi/10.1093/ve/vead089/7504441, https://journals.asm.org/doi/10.1128/mbio.00313-23, https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnae021/7632556, https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954, Confirmation Of The Centrality Of The Huanan Market Among Early COVID-19 Cases, https://substackcdn.com/image/fetch/$s_!BNAm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fffd4cddb-6e3e-41f5-8ef6-ec0b27bec600_626x426.webp, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6178078/, https://www.bloomberg.com/news/articles/2023-07-18/us-suspends-wuhan-institute-funds-over-covid-stonewalling, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7022351/, https://www.biorxiv.org/content/10.1101/2023.02.12.528210v1.full, a published paper, has already published papers about, https://substackcdn.com/image/fetch/$s_!yA9U!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467dd304-190a-4437-8920-d498c433dffb_1600x960.jpeg
I’m not a virologist, but I question how this comparison works. Surely HKU1 got its insert on some specific day. If you take the virus the day before, and then the other virus the day after, there will be no differences except the insert, and it will look just like COVID (ie an insert without many other mutations). The fact that the COVID comparison has few mutations, and the HKU1 insert has many mutations, just shows that whatever older virus we chose to compare HKU1 to is more distant from HKU1 than BANAL-52 (or whatever) is from COVID. Or am I missing something here? [The evidence that China tried to cover up zoonosis from the start] is untrue. They clearly said from the start this is a zoonotic spillover at HSM, and at least part of the government went to immense efforts to identify the animal, close farms, etc. (and of course couldn’t find any infected animal). Only in late 2020 did they start suspecting an import from cold-chain products after having multiple outbreaks that seem related to cold-chain products. From a Vox article from March 2023: From the start, the Chinese government interfered with efforts by both Chinese and international experts to study the pandemic, including its origins. Reporting by the AP found that even as WHO officials were publicly praising China’s cooperation, behind the scenes they were complaining about lack of access and a refusal to share data. Within months of the beginning of the pandemic, the Chinese government imposed restrictions on academic research into the origins of the novel coronavirus … China’s intransigence wasn’t unusual — countries are rarely eager to confirm that they’re the source of a deadly disease — but it went beyond the norm. International investigators weren’t permitted to see the market until more than a year after the pandemic began and a WHO-affiliated team was allowed a highly choreographed and controlled visit. 
The resulting report that came out of the Wuhan visit, which dismissed the possibility of a lab origin, pointed the finger at some kind of zoonotic spillover while concluding that it was unlikely that the spread started at the market, which surprised many experts. It also found that it was “possible” that the virus had been introduced via contaminated frozen food products from abroad. While few experts took that possibility seriously, it fit a narrative the Chinese government had been pushing, against nearly all evidence, that the pandemic had in fact not originated in China. “China just doesn’t want to look bad,” Filippa Lentzos, a biosecurity expert at King’s College London, told Science last August. “They need to maintain an image of control and competence. And that is what goes through everything they do.” […] it seems clear that with more cooperation, scientists could have been looking at raccoon dogs a year or more ago. “The big issue right now is that this data exists and that it is not readily available to the international community,” Maria Van Kerkhove, the WHO’s Covid-19 technical lead, told reporters on Friday. “This is first and foremost absolutely critical, not to mention that it should have been made available years earlier, but that data needs to be made accessible to individuals who can access it, who can analyze it and who can discuss it with each other.” The irony is that by making it so difficult to properly investigate a zoonotic origin of Covid, the Chinese government has created a vacuum that has been filled by claims on all sides, including the much more damning accusation that the pandemic was the result of a lab error at the Wuhan Institute of Virology. 
For what it’s worth, my timeline of Chinese denials and coverups looks like this:
December: COVID doesn't exist, it's all lies
Early January: Fine, it exists, but it’s just some wet market thing that can't spread from person to person
Late January: Fine, it can spread from person to person, but we’ve got it under control now.
February: Fine, it’s out of control, but you would not believe how great our response was. We're basically heroes.
March: COVID was a US bioweapon, or possibly came from Italy.
April: Chinese people are banned from researching the origins of COVID without government permission.
2: Comments Arguing Against Lab Leak
2.1: Is the pandemic starting near WIV reverse correlation?
randomstringofcharacters wrote: Isn't [the pandemic starting near the lab] a reverse correlation issue? The lab is situated there because it's an area where coronaviruses were found in the past.
Many people had this question, but Wuhan Institute of Virology was founded in 1956, didn’t originally focus on coronaviruses, and isn’t in a coronavirus hot spot. Most of WIV’s coronavirus samples come from Yunnan, about a thousand miles away. COVID’s closest relatives were found in Laos, almost two thousand miles away. During the debate, both Saar and Peter calculated the odds of a natural pandemic arising in Wuhan by dividing the population of Wuhan by the total urban population of East Asia (Saar) or South China (Peter). Saar got 1.5%, Peter got 3% (he later said this could be as high as 10% because it was a central hub in the wildlife trade).
This isn’t an Official Position and I don’t think anyone else shares it, but during the debate Peter pointed out a few times that there are plenty of disease-ridden bats in Hubei (the province Wuhan is in), and that it’s not impossible that a bat virus currently known only in Laos could be active in Hubei. Still, this is the minority viewpoint and most scientists just think it involved something about the wildlife trade.
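The population-share calculation described above can be sketched roughly as follows. The passage gives only the resulting percentages (1.5% and 3%), not the population figures Saar and Peter used, so the Wuhan population below is an assumed round number and the regional denominators are simply backed out from the quoted shares. Function and variable names are hypothetical.

```python
# Rough sketch of the "share of urban population" prior described above.
# ASSUMPTION: Wuhan's population is taken as a round 11 million; this figure
# is not in the passage and is used only for illustration.
WUHAN_POPULATION = 11_000_000

# Shares quoted in the passage.
QUOTED_SHARES = {"Saar (East Asia)": 0.015, "Peter (South China)": 0.03}


def city_share(city_population: float, region_urban_population: float) -> float:
    """Prior that a natural outbreak starts in the city, by population share."""
    return city_population / region_urban_population


def implied_region_population(city_population: float, share: float) -> float:
    """Urban population of the reference region implied by a quoted share."""
    return city_population / share


for label, share in QUOTED_SHARES.items():
    region = implied_region_population(WUHAN_POPULATION, share)
    print(f"{label}: share {share:.1%} implies a region of about {region / 1e6:.0f} million")
```

The point of the sketch is only that the two estimates differ because of the denominator: a narrower reference region (South China rather than East Asia) gives a larger share for Wuhan.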
3: Other Points That Came Up
3.1: Apology to Peter re: extreme odds
quiet_NaN wrote: Hot take: Peter clearly failed to convince anyone. The lab leak odds, in log10 (i.e. orders of magnitude are):
Peter -20.7
Saar 2.7
Eric -3.1
Will -2.5
Scott -1.2
Daniel -1.4
One of these numbers is clearly an outlier. Scott mentions it and calls it "trolling", I would argue that it is debating in bad faith. 2e-21 is a ratio which is just silly. For one thing, the gain of function at WiV pathway is not the only pathway towards a lab leak. The WIV could also have released a naturally occurring coronavirus at the wet market. At 2e-21 odds, we would probably have to consider the possibility that the WIV built a time machine and went back in time to infect the wet market.
I might have screwed up here - or at least I should have emphasized the “trolling” part. Peter complained about my presentation of his extreme-odds slide, saying: This is basically accurate. During the debate, Saar gave lots of different numbers. I don’t want to say exactly what the different numbers meant, because in earlier drafts of my post, Saar said I misunderstood them. My impression was that some of his numbers were conservative, others were central, others were extreme, others were adjusted-for-out-of-model-error, others were not-adjusted, etc.
In an early draft of the post, I gave higher numbers for Saar. Saar asked me to replace them with the numbers I ended up using. I decided to agree, because I wanted to represent Saar fairly with the numbers he most centrally believed, but also because these were closest to the numbers on his Rootclaim site so it wasn’t like he was making them up just to fool me. Peter didn’t argue quite as hard, and also he didn’t have anything like the Rootclaim site, so I just took his first set of numbers.
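For readers who want to check the arithmetic in quiet_NaN's comment, here is a minimal sketch converting the listed log10 odds into plain odds ratios. The conversion to a probability assumes the odds are lab leak versus a single alternative (zoonosis), which is an interpretation on my part and not something the comment states. Function names are hypothetical.

```python
# Log10 lab-leak odds as listed in quiet_NaN's comment.
LOG10_ODDS = {
    "Peter": -20.7,
    "Saar": 2.7,
    "Eric": -3.1,
    "Will": -2.5,
    "Scott": -1.2,
    "Daniel": -1.4,
}


def odds_from_log10(log10_odds: float) -> float:
    """Convert log10 odds (orders of magnitude) to a plain odds ratio."""
    return 10.0 ** log10_odds


def probability_from_odds(odds: float) -> float:
    """Convert odds to a probability.

    ASSUMPTION: exactly two exclusive hypotheses (lab leak vs. zoonosis).
    """
    return odds / (1.0 + odds)


for name, log10_odds in LOG10_ODDS.items():
    odds = odds_from_log10(log10_odds)
    print(f"{name}: odds {odds:.3g}, probability {probability_from_odds(odds):.3g}")
```

Peter's -20.7 comes out at roughly 2e-21, which matches the figure quiet_NaN quotes.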
Trying to piece things together, I think a reasonable summary would be: During the debate, Saar mentioned 700-million-to-one odds in favor of lab leak, not because he thought this was plausible, but just as a discussion of where the situation would end up if you didn’t adjust for human fallibility.