COVID is a recurring concept in the Astral Codex Ten archive, appearing 84 times across 84 issues between January 29, 2021 and February 19, 2026. The archive places it in contexts such as "Weyl wrote this essay a few months before COVID"; "50% of Americans really want the COVID vaccine immediately"; "saying 'TRUST THIS GUY'. If a random shmuck who doesn't know anything about anything Googles 'who should I trust about COVID?'". It most often appears alongside US, China, Scott.
- Article page: COVID
- Mention count: 84
- Issue count: 84
- First seen: January 29, 2021
- Last seen: February 19, 2026
- http://benjaminrosshoffman.com/d-is-for-covid
- https://arguablywrong.home.blog/2024/04/09/how-likely-is-it-for-covid-to-establish-itself/
- https://arxiv.org/abs/2403.05859
- https://astralcodexten.substack.com/p/adumbrations-of-aducanumab
- https://astralcodexten.substack.com/p/classifieds-thread-62022/comment/6898105?s=w
- https://astralcodexten.substack.com/p/replication-attempt-bisexuality-and
- https://astralcodexten.substack.com/p/sorry-i-still-think-i-am-right-about/comment/11501846
- https://dailysceptic.org/2022/07/07/twice-as-many-vaccine-deaths-as-covid-deaths-in-u-s-households-poll-finds/
- https://docs.google.com/forms/d/e/1FAIpQLSdj9Blt7KfcZb79W4zFrnW5-MGPoK6WGUtSvek8Ab4SEFSaOg/viewform
- https://docs.google.com/forms/d/e/1FAIpQLSfnbE2_d4UhA9XuW7PjL0tNNkQQAiUybPo4Y34ahGkOqSVvGA/viewform?usp=sf_link
- https://ethics.harvard.edu/Covid-Roadmap
- https://fortune.com/2020/07/15/coronavirus-vaccine-this-year-prediction-markets-coronavirus/
- Contra Weyl On Technocracy
- Metaculus Monday
- WebMD, And The Tragedy Of Legible Expertise
- 21
- Coronavirus: Links, Discussion, Open Thread
- Vitamin D: Much More Than You Wanted To Know
- 2020 Predictions: Calibration Results
- Prospectus On Próspera
- Mantic Monday: Predictions For 2021
- Your Book Review: The Wizard And The Prophet
- Book Review: How Asia Works
- When Does Worrying About Things Trade Off Against Worrying About Other Things?
- Adumbrations Of Aducanumab
- Kids Can Recover From Missing Even Quite A Lot Of School
- Highlights From The Comments On Aducanumab
- If You're So Smart, Why Aren't You Governor Of California?
- Highlights From The Comments On Missing School
- On Hreha On Behavioral Economics
- Long COVID: Much More Than You Wanted To Know
- Too Good To Check: A Play In Three Acts
- Links For October
- Non-Cognitive Skills For Educational Attainment Suggest Benefits Of Mental Illness Genes
- 15
- Ivermectin: Much More Than You Wanted To Know
- Highlights From The Comments On Ivermectin
- Pascalian Medicine
- Open Thread 200
- Open Thread 201
- Does Georgism Work? Part 1: Is Land Really A Big Deal?
- The FDA Has Punted Decisions About Luvox Prescription To The Deepest Recesses Of The Human Soul
- ACX Grants Results
- Links For December
- Movie Review: Don't Look Up
- There's A Time For Everyone
- Grading My 2021 Predictions
- Predictions For 2022
- ACX Grants ++: The Second Half
- Information Markets, Decision Markets, Attention Markets, Action Markets
- Links For April
- Contra Hoffman On Vitamin D Dosing
- Highlights From The Comments On Xi Jinping
- California Gubernatorial Candidates From Z to Z
- 22
- Your Book Review: Public Choice Theory And The Illusion Of Grand Strategy
- Highlights From The Comments On San Fransicko
- Your Book Review: The Society Of The Spectacle
- Links For July
- Your Book Review: Exhaustion
- Highlights From The Comments On Semaglutide
- What Your Doctor Spends 80% Of Their Time Doing
- Sorry, I Still Think I Am Right About The Media Very Rarely Lying
- Highlights From The Comments On The Media Very Rarely Lying
- Response To Alexandros Contra Me On Ivermectin
- Mostly Skeptical Thoughts On The Chatbot Propaganda Apocalypse
- Henrietta Lacks Seems Like A Nice Person, But Not A Scientific Hero
- Links For February 2023
- Grading My 2018 Predictions For 2023
- Highlights From The Comments On IRBs
- Attempts To Put Statistics In Context, Put Into Context
- Highlights From The Comments On British Economic Decline
- Highlights From The Comments On Kidney Donation
- Links For January 2024
- 24
- Less Utilitarian Than Thou
- Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
- Highlights From The Comments On The Lab Leak Debate
- The Compounding Loophole
- 24
- SB 1047: Our Side Of The Story
- H5N1: Much More Than You Wanted To Know
- Links For January 2025
- 1DaySooner's Trump II Health Policy Proposals
- Lives Of The Rationalist Saints
- The Other COVID Reckoning
- The Evidence That A Million Americans Died Of COVID
- Sorry, I Still Think MR Is Wrong About USAID
- ACX Grants 1-3 Year Updates
- Your Review: The Astral Codex Ten Commentariat (“Why Do We Suck?”)
- ACX Grants Results 2025
- Links For October 2025
- Vibecession: Much More Than You Wanted To Know
- Links For December 2025
- Highlights From The Comments On Vibecession
- Crime As Proxy For Disorder
5. Coronavirus lockdowns: The government appointed a set of supposedly infallible scientist-priests to determine when people were or weren't allowed to engage in normal economic activity. The scientist-priests, who knew nothing about the complex set of factors that make one person decide to go to a rock festival and another to a bar, decided that vast swathes of economic activity they didn't understand must stop. The ordinary people affected tried to engage in the usual mechanisms of democracy, like complaining, holding protests, and plotting to kidnap their governors - but the scientist-priests, certain that their analyses were "objective" and "fact-based", thought ordinary people couldn't possibly be smart enough to challenge them, and so refused to budge.
Weyl wrote this essay a few months before COVID, so his pooh-poohing of the idea that there might be a biological catastrophe is an unfortunate anachronism. But I think it's important to note that we got this right (and he got it wrong) precisely because we "privilege rationalist approaches over all other forms of knowledge-making". People like Toby Ord tried to calculate the risk of every kind of disaster and how bad it would be - and at the same time Weyl was making fun of us for caring about biological catastrophes, Ord was writing about how the numbers suggested zoonotic diseases from bats could cause catastrophic pandemics. This kind of work ultimately led to EA flagship group Open Philanthropy Project spending almost $50 million on its Biosecurity And Pandemic Preparedness Program between 2014 and 2019; if other people had taken a few minutes to read our arguments instead of chiding us for how naive it is to prioritize things based on rational methods, maybe the world would have been more prepared.
This week: the coronavirus.
Late last year, when coronavirus had already killed 285,000 Americans, Metaculus asked users to predict how many would be dead by the end of 2021. The guesses started at about 500,000. But as cases rose further through December and January, the guesses rose too, until now they're averaging almost 690,000 people.
When will some official like the Director of the CDC announce that a coronavirus vaccine is available to any adult who wants it (as opposed to just front-line workers, only seniors, etc)? The question as asked is a little odd, since it will probably be available in some areas before others, but this is apparently about some kind of nationwide announcement.
This is actually a widespread problem in medicine. The worst offender is the FDA, which tends to list every problem anyone had while on a drug as a potential drug side effect, even if it obviously isn't. This got some press lately when Moderna had to disclose to the FDA that one of the coronavirus vaccine patients got struck by lightning; after a review, this was declared probably unrelated. For the more serious version of this, read Get Ready For False Side Effects. Why does the FDA keep doing this if they know it makes their label information useless? My guess is it's because they don't want to look like cowboys who unprincipledly consider some things but not other things. What if someone accused the person deciding what things to consider of being biased? So the FDA comes up with a Procedure, and once you have a Procedure it has to be "take everything seriously", and then it falls on random small-fry people who aren't the FDA to pick up the slack and explain which side effects are worth worrying about or not, and then those small fries don't do that, because they could get sued.
The way I imagine this is that Zvi reads some papers on whether the coronavirus has airborne transmission, sees the direction they're leaning, and announces on his blog that it probably has airborne transmission.
If you're planning the coronavirus response, maybe the best thing you can do is lock Zvi in a cave completely incommunicado and make him write one for you. The moment there's a gap in the cave, thousands of lobbyists and activists and politicians will rush in, trying to sue him or bribe him or threaten him or guilt him into changing it to favor their constituency. If he has the slightest shred of self-preservation, the end result will be some balance between the good plan he would have written earlier, and the stuff he needs to include to avoid getting sued or fired or cancelled or universally-loathed or mobbed.
Kalshi is "the first regulated futures exchange dedicated to trading on event outcomes". As far as I can tell they actually did it. They got the government to agree to let them run a prediction market, regulated as investing rather than gambling. There will be a $25,000 max on contracts, so it might not be enough for the real whales, but that's still thirty times higher than existing prediction markets and enough to probably be a sea change. I talked to a representative who said they'll be "focused on creating markets in a few key categories such as climate, economics, geopolitics, energy, education, government, space, COVID, and technology", but that they remain open to potentially expanding to all sorts of other areas (except onions, which apparently have a very specific carve-out as something which it is illegal to have futures markets about, somebody please write a cyberpunk novel revolving around this). So far they're still setting things up, but they describe themselves as "ramping quickly towards launch". I am super psyched about this and will keep you all updated.
So far there have been three waves of coronavirus cases in the US. The first wave was the beginning, when it caught us unprepared. The second wave was in July, when we got sloppy and lifted lockdowns too soon. The third wave was November through January, because the coronavirus is seasonal and winter is its season (also probably the holidays). From Johns Hopkins CRC:
A fourth wave may hit in March, when the more contagious B117 strain from the UK takes over. Expect more shelter-in-place orders, school shutdowns, and a spike in cases at least the size of July's, maybe December's. That will last until May-ish, when the usual control system (more virus -> stricter lockdowns -> less virus -> looser lockdowns -> more virus) moves back into the "less virus" stage. Also coronavirus is seasonal and summer isn't its season. Also by that time a decent chunk of the population will be vaccinated. The worst consequences of the UK strain should burn themselves out by late spring.
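The "usual control system" described above (more virus -> stricter lockdowns -> less virus -> looser lockdowns -> more virus) can be sketched as a toy feedback loop. This is only an illustration: the function name, growth and shrink factors, and thresholds below are hypothetical values chosen to show the oscillation, not estimates from the post or from any epidemiological data.

```python
def simulate_control_loop(weeks=40, cases=100.0, grow=1.5, shrink=0.6,
                          tighten_at=1000.0, loosen_at=200.0):
    """Toy model: lockdowns tighten when cases are high and loosen when low.

    All parameters are hypothetical. Returns a list of
    (cases_at_end_of_week, lockdown_was_strict) tuples.
    """
    strict = False
    history = []
    for _ in range(weeks):
        # Policy reacts to the current case count, with two thresholds
        # so that it does not flip back and forth every single week.
        if cases >= tighten_at:
            strict = True
        elif cases <= loosen_at:
            strict = False
        # Cases shrink under strict lockdown and grow otherwise.
        cases *= shrink if strict else grow
        history.append((cases, strict))
    return history


if __name__ == "__main__":
    for week, (cases, strict) in enumerate(simulate_control_loop(), start=1):
        print(f"week {week:2d}: {cases:8.1f} cases, strict={strict}")
```

Under these assumed parameters the case count never settles; it cycles between the two thresholds, which is the "waves" pattern the paragraph describes.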
We should also be concerned about a fifth wave (possibly overlapping with the fourth wave; they may not have obviously separate peaks). Virologists have identified two new strains, one in South Africa, one in Brazil, which probably have "immune escape" - the ability to infect people who have already gotten, recovered from, and developed antibodies to the original strain (or been vaccinated against it). Both strains already have a few cases in the US. It will take them a few months to spread to the point where they're relevant, but they should eventually be the majority of new cases.
Most health articles ask you to act on their opinions. I am specifically asking you not to act on mine. In a moment, I'll tell you whether or not I think Vitamin D prevents or treats coronavirus. But I'll give you a free spoiler: I am less than 100% certain of what I'm about to say. So if you want to take Vitamin D, take it. If it does prevent or cure coronavirus, great. If not, the worst that will happen is you'll have slightly better bone health. I can't stress how much I don't want to be those people who said they couldn't prove face masks helped so you must not use face masks. Just ignore everything I'm saying, do a quick cost-benefit calculation, and take Vitamin D. That having been said:
Lots of people think Vitamin D treats coronavirus, and some of them have good evidence. For example, infection rate from coronavirus seems latitude dependent; in general, the further north an area, the worse it's been hit. Northern areas get less sunlight, and sunlight helps produce Vitamin D, so whenever you see a disease that's worse at high latitudes, Vitamin D should be on your short list of potential causes.
Also - in the US, COVID seemed to remit with the summer and worsen over the winter. It's hard to distinguish this from general exponential growth and from the effect of playing ping-pong with gradually loosening/tightening lockdowns, but the US spike this winter was pretty dramatic. Most Northern Hemisphere countries show such a pattern, most equatorial countries don't, and some Southern Hemisphere countries arguably show the opposite. Whenever you see a disease that's better in summer and worse in winter, Vitamin D is one of the possible culprits.
CORONAVIRUS:
1. Bay Area lockdown (eg restaurants closed) will be extended beyond June 15: 60%
2. …until Election Day: 10%
3. Fewer than 100,000 US coronavirus deaths: 10%
4. Fewer than 300,000 US coronavirus deaths: 50%
5. Fewer than 3 million US coronavirus deaths: 90%
6. US has highest official death toll of any country: 80%
7. US has highest death toll as per expert guesses of real numbers: 70%
8. NYC widely considered worst-hit US city: 90%
9. China’s (official) case number goes from its current 82,000 to 100,000 by the end of the year: 70%
10. A coronavirus vaccine has been approved for general use and given to at least 10,000 people somewhere in the First World: 50%
11. Best scientific consensus ends up being that hydroxychloroquine was significantly effective: 20%
12. I personally will get coronavirus (as per my best guess if I had it; positive test not needed): 30%
13. Someone I am close to (housemate or close family member) will get coronavirus: 60%
14. General consensus is that we (April 2020 US) were overreacting: 50%
15. General consensus is that we (April 2020 US) were underreacting: 20%
16. General consensus is that summer made coronavirus significantly less dangerous: 70%
17. …and there is a catastrophic (50K+ US deaths, or more major lockdowns, after at least a month without these things) second wave in autumn: 30%
18. I personally am back to working not-at-home: 90%
19. At least half of states send every voter a mail-in ballot in 2020 presidential election: 20%
20. PredictIt is uncertain (less than 95% sure) who won the presidential election for more than 24 hours after Election Day: 20%
PROFESSIONAL
72. I’ve gotten at least one new patient to do a full wake therapy protocol: 60%
73. I have specific, set-in-motion plans to quit work / start my own business: 5%
74. I work the same schedule and locations I did before the coronavirus: 80%
75. I get a bonus for 2020: 20%
Most of my mistakes were correlated with two big errors: expecting COVID lockdown to last much shorter than it did, and missing the NYT situation and its aftermath. This isn’t an excuse - part of what you’re supposed to do in being calibrated is understand the possibility that black swan events could happen that throw everything off.
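As a rough illustration of what "being calibrated" means here, the sketch below groups predictions by their stated probability and reports how often each group came true. The function name and the sample data are hypothetical and are not the post's actual results; a well-calibrated forecaster would see roughly 60% of their 60% predictions come true, 90% of their 90% predictions, and so on.

```python
from collections import defaultdict


def calibration_table(predictions):
    """Group (stated_probability, came_true) pairs by stated probability.

    Returns {stated_probability: (observed_frequency, count)}.
    """
    buckets = defaultdict(list)
    for prob, came_true in predictions:
        buckets[prob].append(bool(came_true))
    return {
        prob: (sum(outcomes) / len(outcomes), len(outcomes))
        for prob, outcomes in sorted(buckets.items())
    }


if __name__ == "__main__":
    # Hypothetical toy data, NOT the real outcomes of the predictions above.
    toy_predictions = [
        (0.6, True), (0.6, False),
        (0.9, True), (0.9, True), (0.9, False),
    ]
    for prob, (freq, n) in calibration_table(toy_predictions).items():
        print(f"stated {prob:.0%}: {freq:.0%} came true (n={n})")
```

With only a few predictions per bucket the observed frequencies are noisy, and one unexpected event that flips many correlated predictions at once can throw off several buckets together, which is the failure the paragraph above describes.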
There was a weird event which I don't have a great perspective on, where Erick Brimen, CEO of HPI, scheduled a meeting in Crawfish Rock to explain what Próspera was and why people didn't need to be afraid of it. The municipal government banned him from having the meeting, supposedly because of COVID. Brimen felt he was being silenced, and said that he would have the meeting outdoors and observe all social distancing guidelines but otherwise wasn't going to back off. He held the meeting, the municipal government sent police to shut it down, and after some resistance it got shut down (see video below).
Never underestimate embarrassment as a driver of progress. When the US was dragging its feet on COVID vaccines, Israel vaccinated its population quickly and safely, and that embarrassed us enough to get the ball rolling. Or: the US is terrible at building infrastructure. We’re starting to discuss ways to fix this, but we wouldn’t have realized anything was wrong without stories about China constructing high-speed rail at lightning speed, or memories of our grandparents building Hoover Dam and the interstate highways.
COVID
23. Fewer than 10K daily average official COVID cases in US in December 2021: 30%
24. Fewer than 50K daily average COVID cases worldwide in December 2021: 1%
25. Greater than 66% of US population vaccinated against COVID: 50%
26. India's official case count is higher than US: 50%
27. Vitamin D is not generally recognized (eg NICE, UpToDate) as effective COVID treatment: 70%
28. Something else not currently used becomes first-line treatment for COVID: 40%
29. Some new variant not currently known is greater than 25% of cases: 50%
30. Some new variant where no existing vaccine is more than 50% effective: 40%
31. US approves AstraZeneca vaccine: 20%
32. Most people I see in the local grocery store aren't wearing a mask: 60%
I was a Wizard once, but then I took an arrow to the knee. I mean… then COVID-19 happened. If you had asked me before March 2020 whether I thought we could science our way out of a slow-grind, long-term disaster scenario like climate change, I would have said categorically yes. As long as we have some Borlaugs out there laser-focusing on the problem, there’s no actual danger that we’ll be constructing Waterworld-style life rafts over the flooded remnants of our formerly glorious civilization. But, uh, I’m no longer so sure.
If you look around, you’ll see lots of other COVID-like problems out there that are quietly but inexorably claiming lives and dragging down average utility worldwide – poverty, homelessness, economic stagnation – that Wizards haven’t found good solutions for. I don’t think it’s from a lack of trying; I think we may have hit a carrying capacity limit on our ability to deal with complexity. Systems in the modern world are complex. No, really complex. No, even more complex than that. Consider all of the different systems that interacted to form the giant clusterfuck that is the COVID-19 pandemic response: local politics, global politics, scientific knowledge production, scientific knowledge dissemination, the media, social media, business, regulations, logistics chains. Each of these contain multitudes of factors that no single human on earth, not even the Normanest of Borlaugs, could keep straight in his or her head and "fix" with a single quick hack like a better strain of wheat. And these complex systems aren’t just statically complex – they seem to be getting more complex over time in their interactions with each other. In the 1960s, Borlaug’s new wheat strain was used by virtually all Mexican farmers a year after it was released commercially; if it had been created today, it probably would have sat on a shelf for a decade while various global FDA-like agencies dickered over whether it was safe or not and anti-GMO groups launched a thousand frankenwheat memes; you’d definitely never be able to buy it at Whole Foods.
The problem with the concept of carrying capacity – impossible to define or know when we’ve surpassed, until after the proverbial moment when killing one mosquito or dovekie too many plunges us into everlasting fire and brimstone – is that it seems mostly like a mood affiliation thing. Consider this: when you look out at the bird feeder hanging in your backyard (Bay Area people living in closets, use your imagination here), do you see a blissful symbiotic coexistence of the human world and nature, with humans generously giving of our growth-accumulated abundance to help the birds flourish? Or do you see a dystopian struggle where innocent creatures pushed to the brink of death by our traffic and pesticides and housecats must rely on meager scraps for survival? There isn’t really a right view here, I don’t think, just a predilection for seeing what matches your intuition and telling a story about it. Given the life story Mann depicts for Vogt, it’s not hard to see why he would lean toward the pessimistic take. From the childhood marred by a philandering father who left his family in a cloud of scandal and ruin, to the adult life spent stumbling from one bourgeois non-occupation (theater critic, bird watcher, government mole rooting out Nazis in South America) to the next, to the childlessness and divorce, Vogt seems happiest when he is away from any humans, whether tromping through pre-suburbanized Long Island, or alone with the guano-producing birds on the desolate coast of Peru. Were he to live in an age of Facebook and Twitter, he would definitely be that guy reposting memes that COVID is finally letting our planet "heal."
But even though he puts it excruciatingly kindly, Studwell is kind of accusing the global economic establishment of impoverishing several dozen major countries. He goes through the recent history of Indonesia, Thailand, Malaysia, and the Philippines, and shows how in each case, the global economic establishment (especially the IMF and World Bank) convinced them not to protect infant industries, not to institute capital controls, and not to stress manufacturing too much. Then each of those countries suffered financial crises and their development stagnated. It really is striking how the countries that did the best were the ones that gave the world establishment the middle finger (unless of course this is cherry-picking and there are lots of big countries that followed IMF advice and did great). To whatever degree this is true, it belongs on the list of science failures that should keep us up at night, alongside all those people saying not to wear masks for COVID. Except that the COVID mistake lasted a few months and killed a 5-6 digit number of people, and the IMF advice lasted decades and kept hundreds of millions in poverty.
(2): Instead of worrying about Republican obstructionism in Congress, we should worry about the potential for novel variants of COVID to wreak devastation in the Third World.
Here’s a toy model that could potentially rescue Acemoglu’s argument: two topics can either be complements to each other, or substitutes for each other. The more closely related the topics, the more likely that one or the other is true, but it’s hard to ever tell exactly which, let alone what the exact effect size is. So concern about police brutality and concern about police evidence fabrication are probably complements - the more you produce of one, the more people want of the other. Concern about Republican obstructionism and about COVID are neutral; changes in one don’t affect the price of the other (I mean, at the ground level they’re competing for a limited amount of newspaper column space, but this is no more interesting than the fact that grapes and rockets are competing for a limited amount of labor/land/capital - in the real world, vineyards and space programs aren’t competitors in any meaningful sense). Concern about some other topics could be substitutes, where the more of one you produce, the less people will want of the other. And it just so happens that the only two substitute goods in the entire intellectual universe are concern about near-term AI risk, and concern about long-term AI risk.
Here’s another good example: coronavirus vaccines. The FDA still has not fully approved any coronavirus vaccine. The only reason you’re allowed to get vaccinated at all is because of a fast-track provisional approval somewhat like the one used for aducanumab. Coronavirus vaccines have probably also averted a few hundred thousand deaths.
The countries that got through COVID the best (eg South Korea and Taiwan) controlled it through test-and-trace. This allowed them to scrape by with minimal lockdown and almost no deaths. But it only worked because they started testing and tracing really quickly - almost the moment they learned that the coronavirus existed. Could the US have done equally well?
I think yes. A bunch of laboratories, universities, and health care groups came up with COVID tests before the virus was even in the US, and were 100% ready to deploy them. But when the US declared that the coronavirus was a “public health emergency”, the FDA announced that the emergency was so grave that they were banning all coronavirus testing, so that nobody could take advantage of the emergency to peddle shoddy tests. Perhaps you might feel like this is exactly the opposite of what you should do during an emergency? This is a sure sign that you will never work for the FDA.
I hear this is happening again now, with more school closures, more frantic parents, and more people asking awful questions like “should I accept the risk of sending my immunocompromised kid to school, or should I accept him falling behind and never amounting to anything?”
Source. New Orleans’ ACT scores improved from pre-Katrina to post-Katrina, even though the post-Katrina kids missed a year or two of school. I don’t want to trivialize the hard work that educators and school reformers probably put in to catch them up - but given that hard work, they caught up. I think educators will also put a lot of work into catching up kids who have to miss school because of COVID. Some parents "unschool" their children. That is, they object to schooling as traditionally understood, so they register themselves as home schooling but don't formally teach much, limiting themselves to answering kids' questions as they come up. When adjusted for confounders (ie usually these parents are rich and well-educated), their young children lag one grade level behind public school students on average - but only one (though these students were pretty young and they might have lagged further behind with time). By the time these unschooled kids are applying for college, they seem to know a decent amount, get into college at relatively high rates, and do well in their college courses. I think there’s some evidence that not getting any school at all harms these children’s performance on some traditional measures. But it doesn’t harm them very much. Given how little effect there is from absolutely zero school ever, I think missing a year or two of school isn’t going to matter a lot.
If these claims are right, then ten years from now, we won’t find a huge effect of COVID lockdowns on student learning.
Still, like every other good idea, doctors are prone to following this one off a cliff. Suppose a study proves that mask mandates decrease coronavirus cases. Can we assume that they also prevent coronavirus hospitalizations and deaths? Seems pretty obvious that they should - but this is the dreaded surrogate endpoint, so purists will insist we continue the study for months longer to get a big enough sample of COVID hospitalizations/deaths to reach significance. Supposing that we prove that mask mandates prevent COVID deaths, can we be sure that the people who die will still be dead next year? Isn’t “dead now” technically just a surrogate endpoint for “dead over the long-term”, which is really what we care about? At some point you have to have faith in Bayes, don’t you? So in the general case I’m kind of split on this. Still, in the specific case of aducanumab we actually have specific, positive evidence that the surrogate endpoint doesn’t capture the real endpoint, so obviously this is bad.
> [Texas Governor] Greg Abbott received a third booster dose of the vaccine, something not yet available to the public. Now he’s receiving monoclonal antibody treatment, something the FDA has only authorized for those with “high risk for progression to severe COVID-19.” Must be nice.
Even though I’m arguing against Scott’s recent post, I want to preface this by saying I don’t really disagree with his broad conclusion all that much, which is that if you could only increase or decrease the strictness of the FDA, broadly defined, you should probably decrease the strictness. However, I think the story is complicated and if you want the best reforms, you need a better sense of where the FDA has gone right and wrong in the past. Also, I completely agree that regarding COVID-19, the FDA has been far too slow on approving testing, vaccines, etc.
Governor Gavin Newsom had a bad year. First he pissed off Republicans with his strong response to COVID. Then he pissed off the people who wanted strong responses to COVID by attending an unvaccinated unmasked dinner. Also, taxes are still high, homelessness is still high, rents are still (too damn) high, and parts of the state are literally on fire. Gavin Newsom didn't cause most of this, but he also hasn't announced any particularly inspired plans to fight it. Just a really, really bad year.
Yeah, I have some anti-school positions. But I was trying not to show them in the post. The post was aimed at generally pro-school people who were freaking out over the possibility of their kids missing a few months of school because of COVID, and I tried to meet them on their level. The fact that you can miss a few months of school and still do well academically, is no different in principle from the fact that you can miss a few months of training, and still become a world-class athlete. Great athletes miss a few months of training all the time, for injuries or something, and nobody ever says “oh, she missed six months of training, now she’ll never catch up to all those other athletes who have six months more training than she does”.
Somewhere in this process, they did an experiment where they gave participants a quarter minted in Denver and asked them if they wanted to exchange it for a quarter minted in Philadelphia. 60% of people very reasonably didn’t care, but another 35% had grown attached to their Denver quarter, with only 5% actively seeking the novelty of Philadelphia. Psychology is weird. I understand why some people would summarize this paper as “loss aversion doesn’t exist”. But it’s very different from “power posing doesn’t exist” or “stereotype threat doesn’t exist”, where it was found that the effect people were trying to study just didn’t happen, and all the studies saying it did were because of p-hacking or publication bias or something. People are very often averse to losses. This paper just argues that this isn’t caused by a specific “loss aversion” force. It’s caused by other forces which are not exactly loss aversion. We could compare it to centrifugal force in physics: real, but not fundamental. Also, you can’t use this paper to argue that “behavioral economics is dead”. At best, the paper proves that loss aversion is better explained by other behavioral economic concepts. But you can’t get rid of behavioral econ entirely! The stuff you have to explain is still there! It’s just a question of which parts of behavioral econ you use to explain it. Complicating this even further is Mrkva et al, Loss Aversion Has Moderators, But Reports Of Its Death Are Greatly Exaggerated (h/t Alex Imas, who has a great Twitter thread about this). This is an even newer paper, 2019, which argues that Gal and Rucker are wrong, and loss aversion does have an independent existence as a real force. There are many things to like about this paper. Previous criticisms of loss aversion argue that most experiments are performed on undergrads, who are so poor that even small amounts of money might have unusual emotional meaning. Mrkva collects a sample of thousands of millionaires (!) 
and demonstrates that they show loss aversion for sums of money as small as $20. On the other hand, I’m not sure they’re quite as careful as G&R at ruling out every other possible bias (although I don’t have a great understanding of where the borders between biases are and I can’t say this for sure). The main point I want to make is that all the scientists in this debate seem smart, thoughtful, and impressive. This isn’t like social priming experiments where one person says a crazy thing, nobody ever replicates it at scale, and as soon as someone tries the whole thing collapses. These have been replicated hundreds of times, with the remaining arguments being complicated semantic and philosophical ones about how to distinguish one theory from a very slightly different theory. If that takes replicating your result on a sample of thousands of millionaires, people will gather a sample of thousands of millionaires and get busy on the replication. Just overall really impressive work. I don’t feel qualified to take a side in the G&R vs. Mrkva debate, but both teams make me really happy that there are smart and careful people considering these questions. And this is just a drop in the bucket. Alex Imas also links Replicating patterns of prospect theory for decision under risk, which says: Though substantial evidence supports prospect theory, many presumed canonical theories have drawn scrutiny for recent replication failures. In response, we directly test the original methods in a multinational study (n = 4,098 participants, 19 countries, 13 languages), adjusting only for current and local currencies while requiring all participants to respond to all items. The results replicated for 94% of items, with some attenuation. Twelve of 13 theoretical contrasts replicated, with 100% replication in some countries. Heterogeneity between countries and intra-individual variation highlight meaningful avenues for future theorizing and applications. 
We conclude that the empirical foundations for prospect theory replicate beyond any reasonable thresholds. Beyond any reasonable thresholds! IV. Do Nudges Work? or, How Small Is Small? Continuing through the Hreha article: For a number of years, I've been beating the anti-nudge drum. Since 2011, I've been running behavioral experiments in the wild, and have always been struck by how weak nudges tend to be. In my experience, nudges usually fail to have *any* recognizable impact at all. This is supported by a paper that was recently published by a couple of researchers from UC Berkeley. They looked at the results of 126 randomized controlled trials run by two "nudge units" here in the United States. I want you to guess how large of an impact these nudges had on average... 30%? 20%? 10%? 5%? 3%? 1.5%? 1%? 0%? If you said 1.5%, you'd be right (the actual number is 1.4%, but if I had written that out you would have chosen it because of its specificity). According to the academic papers these nudges were based upon, these nudges should have had an average impact of 8.7%. But, as you probably understand by now, behavioral economics is not a particularly trustworthy field. I actually emailed the authors of this paper, and they thought the ~1% effect size of these interventions was something to be applauded—especially if the intervention was cheap & easy. Unfortunately, no intervention is truly cheap or easy. Every single intervention requires, at the very minimum, administrative overhead. If you're going to do something, you need someone (or some system) to implement and keep track of it. If an intervention is only going to get you a 1% improvement, it's probably not even worth it. Uber infamously had a team of behavioral economists working on its product, trying to “nudge” people in the right direction. Relatedly, Uber makes $10 billion in yearly revenue. If they can “nudge” people to spend 1% more, that’s $100 million. 
That’s not much relative to revenue, but it’s a lot in absolute terms. In particular, it pays the salary of a lot of behavioral economists. If you can hire 10 behavioral economists for $100,000 a year and make $100 million, that’s $99 million in profit. Or what if you’re a government agency, trying to nudge people to do prosocial things? There are about 90 million eligible Americans who haven’t gotten their COVID vaccine, and although some of them are hard-core conspiracy theorists, others are just lazy or nervous or feel safe already. (source) Whoever decided on that grocery gift card scheme was nudging, whether or not they have an economics degree - and apparently they were pretty good at it. If some sort of behavioral econ campaign can convince 1.5% of those 90 million Americans to get their vaccines, that’s 1.4 million more vaccinations and, under reasonable assumptions, maybe a few thousand lives saved. Hreha says that: Every single intervention requires, at the very minimum, administrative overhead. If you're going to do something, you need someone (or some system) to implement and keep track of it. If an intervention is only going to get you a 1% improvement, it's probably not even worth it. This depends on scale! 1% of a small number isn’t worth it! 1% of a big number is very worth it, especially if that big number is a number of lives! A few caveats. First, a small number only matters if it’s real. It’s very easy to get spurious small effects, so much so that any time you see a small effect you should wonder if it’s real. I’m ready to be forgiving here because behavioral economics is so well-replicated and common-sensically true, but I wouldn’t blame anyone who steers clear. Second, Hreha says: To be honest, you can probably use your creativity to brainstorm an idea that will get you a 3-4% minimum gain, no behavioral economics "science" required. Which leads me to the final point I'd like to make: rules and generalizations are overrated. 
The reason that fields like behavioral economics are so seductive is because they promise people easy, cookie-cutter solutions to complicated problems. Figuring out how to increase sales of your product is hard. You need to figure out which variables are responsible for the lackluster interest. Is the price the issue? Is the product too hard to use? Is the design tacky? Is the sales organization incompetent? Is the refund/return policy lacking? etc. Exploring these questions can take months (or years) of hard work, and there's no guarantee that you'll succeed. If, however, a behavioral economist tells you that there are nudges that will increase your sales by 10%, 20%, or 30% without much effort on your part... Whoa. That's pretty cool. It's salvation. Thus, it's no surprise that governments and companies have spent hundreds of millions of dollars on behavioral "nudge" units. Unfortunately, as we've seen, these nudges are woefully ineffective. Specific problems require specific solutions. They don't require boilerplate solutions based on general principles that someone discovered by studying a bunch of 19 year old college students. However, the social sciences have done a good job of convincing people that general principles are better solutions for problems than creative, situation-specific solutions. In my experience, creative solutions that are tailor-made for the situation at hand *always* perform better than generic solutions based on one study or another. Hreha is a professional in this field, so presumably he’s right. Still, compare to medicine. A thoughtful doctor who tailors treatment to a particular patient sounds better (and is better) than one who says “Depression? Take this one all-purpose depression treatment which is the first thing I saw when I typed ‘depression’ into UpToDate”. But you still need medical journals. Having some idea of general-purpose laws is what gives the people making creative solutions something to build upon. 
(also, at some point your customers might want to check your creative solution to see whether it actually gives a “3-4% minimum gain, no behavioral economics required”, and that would be at least vaguely study-shaped.) Third, everyone who said nudging had vast effects is still bad and wrong. Many of them were bad and wrong and making fortunes consulting for companies about how to implement the policies they were claiming were super-powerful. This is suspicious and we should lower our opinion of them accordingly. In a previous discussion of growth mindset, I wrote: Imagine I claimed our next-door neighbor was a billionaire oil sheik who kept thousands of boxes of gold and diamonds hidden in his basement. Later we meet the neighbor, and he is the manager of a small bookstore and has a salary 10% above the US average... Should we describe this as “we have confirmed the Wealthy Neighbor Hypothesis, though the effect size was smaller than expected”? Or as “I made up a completely crazy story, and in unrelated news there was an irrelevant deviation from literally-zero in the same space”? All the people talking about oil sheiks deserve to get asked some really uncomfortable questions. And a lot of these will be the most famous researchers - the Dan Arielys of the world - because of course the people who successfully hyped their results a lot are the ones the public knows about. Still, the neighbor seems like a neat guy, and maybe he’ll give you a job at his bookstore. V. Conclusion: Musings On The Identifiable Victim Effect I actually skipped the very beginning of Hreha’s article. I want to come back to it now. It begins: The last few years have been particularly bad for behavioral economics. A number of frequently cited findings have failed to replicate. Here are a couple of high profile examples: The Identifiable Victim Effect (featured in the workbooks I wrote with Dan Ariely and Kristen Berman in 2014)
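The scale argument in section IV can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted above ($10 billion in revenue, a 1% nudge, ten economists at $100,000 each, and 1.5% of 90 million unvaccinated Americans); the function name `nudge_payoff` is made up for illustration and is not from Hreha or the paper he cites.

```python
def nudge_payoff(base, effect, cost):
    """Return (gain, net) for a nudge that shifts a fraction `effect`
    of `base`, implemented at a fixed `cost`. Hypothetical helper."""
    gain = base * effect
    return gain, gain - cost


# Uber example from the text: $10 billion yearly revenue, a 1% nudge,
# and ten behavioral economists at $100,000 a year each.
uber_gain, uber_net = nudge_payoff(10_000_000_000, 0.01, 10 * 100_000)

# Vaccine example from the text: 1.5% of 90 million eligible Americans.
# This comes to 1.35 million, which the text rounds to 1.4 million.
extra_vaccinations = 90_000_000 * 0.015

print(f"Uber: ${uber_gain:,.0f} gained, ${uber_net:,.0f} net of salaries")
print(f"Vaccines: {extra_vaccinations:,.0f} extra vaccinations")
```

The same 1% effect applied to a base of, say, a few thousand dollars would not cover even one salary, which is the sense in which "this depends on scale".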
Britain’s Office for National Statistics looks at people who had a confirmed positive COVID test three months earlier, and finds that 14% report having Long COVID symptoms, compared to 2% of a COVID-less control group. This is substantially lower than the earlier study, which found 33% at 6 months. Probably this is because the previous one asked about a bunch of symptoms, whereas this one just asked “Are you having Long COVID?” Lots of people who had some minor symptom or other might not have made the connection, or might have thought that their symptom didn’t qualify for a full diagnosis.
This is terrible. Recovery rates in the single digit percentages over the space of years. You would think at least some patients would get placebo recoveries, or forget how it felt to be well, or otherwise Lizardman themselves into fake complacency, but no. This is f@#$ing awful. Maybe COVID won’t be this bad? One ray of hope comes from this Australian study, where doctors record the rates of recovery from postviral fatigue after various rare diseases they encounter (Epstein-Barr, Q fever, Ross River virus). They find that 35% of these patients have postviral fatigue after six weeks, but only 12% after six months, and 9% after twelve months. This sounds a lot better than chronic fatigue. In fact, these people do the kind of weird task of figuring out how bad different diagnostic labels for fatigue are, even though some might argue that all the labels refer to the same underlying reality. They find an official diagnosis of “CFS/ME” (chronic fatigue syndrome / myalgic encephalomyelitis) is much worse than “postviral fatigue”. Using the weird measure of “days per year of followup with diagnosis” (I’m not sure I fully understand their reasoning for why this is good), they find a median length of 80 for CFS/ME vs. 0 for PVF (…huh?). Using the more comprehensible measure of percent who still complain of fatigue after 7-12 months, they find it’s 24% vs. 10% (which super contradicts the above study saying that basically nobody with a CFS/ME diagnosis ever recovers). My guess is that this study had much lower criteria for a CFS/ME diagnosis (some doctor diagnosed it and put it on the insurance records) compared to the ones above (some specialist confirmed it by official criteria). The conclusion I draw is that, while official CFS/ME is horrible and hopeless, there are a lot of things that unofficially look kind of chronic-fatigue-ish which have pretty good prognoses. 
Since there’s no good reason to think post-COVID fatigue is official CFS/ME as opposed to just some chronic-ish fatigue-ish thing, probably it will have a better prognosis, more like weird Australian viruses. …which we still don’t know, because AFAICT nobody has done any good studies on postviral fatigue lasting more than a year. 5. Psychosomatic symptoms probably aren’t the majority of long COVID. I mean, I’m not seeing too many people claiming that they are. There are a lot more people worried that someone else might be claiming that, than people actually making the claim. Still, the Wall Street Journal opinion section is always up for slathering itself in glue and rolling around in a haystack until it becomes the straw man everyone else warned you about, and they do have an article on The Dubious Origins Of Long COVID. They point out that long COVID was first thrust into the public consciousness in surveys run by Body Politic, who self-describe as “a queer feminist wellness collective merging the personal and the political”. I agree this is a weird source for something to come from, but Hans Asperger was a Nazi and I still use his diagnosis, so I probably have to accept these people’s as well. More relevantly, WSJ points out that many of the people complaining of Long COVID symptoms test negative for COVID, or at least never tested positive. This complaint conflates the fact that not everyone was able to get a COVID test at all, with the fact that sometimes you get the acute COVID test after you’ve recovered from acute COVID and it’s negative, with the fact that COVID tests don’t have a 100% success rate, with the fact that yeah, okay, some people who didn’t have COVID are probably imagining Long COVID symptoms. I feel like some of the case-control studies above, which clearly show that seropositive people have higher rates of Long COVID than seronegative people, are pretty convincing here. 
But also - the people with lung scarring clearly have lung scarring, and most of them have weird x-rays consistent with lung scarring. If you have lung scarring, then you have trouble breathing, you’re fatigued, and you probably have lots of other stuff downstream of that. The people with smell/taste disturbances clearly have smell/taste disturbances, testable with the stupidly named but scientifically venerable Sniffin Sticks test - and also, who even cares enough to make up olfactory problems? Fatigue and brain fog are the only symptoms here that can’t be easily objectively confirmed, and, well, do you think those Australians who got infected with Q fever and had twelve months of postviral fatigue are faking? What about all those post-Epstein Barr fatigue people? Lots of viruses cause postviral fatigue, it’s not really surprising that COVID should also. (WSJ also spends a while arguing that CFS/ME is just a psychiatric disorder, which I think is not really in keeping with the best recent evidence. Also, as a psychiatrist, I’m very against this conclusion, mostly because if it were true, then people would expect me to cure CFS/ME patients.) One point WSJ didn’t bring up but could have was that most Long COVID patients are women. Probably this is somewhere between 60 and 80% - I suspect on the lower end of this, because I think women are more likely to talk about these kinds of things than men, and much more likely to eg join Facebook groups. This is noteworthy, because women are traditionally more prone to psychosomatic illnesses - so much that the ancients attributed these to the uterus and called them hysteria (note shared root with eg “hysterectomy”). Women are about 2x as likely to get diagnosed with panic disorder, anxiety disorders, phobias, etc, about 2.5x as likely to get chronic Lyme disease, widely regarded as an entirely psychosomatic condition, and 3-5x more likely to be diagnosed with fibromyalgia. So the female preponderance is suspicious. 
But women are also somewhere between 2x and 4x more likely to get autoimmune disorders than men (it varies by disorder - the ratio for Sjogren’s is as high as 16x). There are some pretty crazy hypotheses for why this is - for example, maybe women’s immune systems are permanently upregulated to be prepared for attempts by the placenta to secrete immune-downregulating chemicals during pregnancy, as part of the creepy shadow war between mother and fetus to regulate the maternal environment. I don’t know, do you have a better idea? Anyway, women have more autoimmune issues and more upregulated immune systems, so if there was any good way to assess gender ratio in true postviral fatigue excluding all psychosomatic cases, that would probably be female-biased too. Probably some Long COVID cases are psychosomatic just like some cases of anything are psychosomatic, but I don’t see too many signs that this is too important in explaining the phenomenon. …and please allow me a moment of preachiness here. Chronic fatigue sounds really fake to anyone who doesn’t have it. I think this is because it’s related to willpower. Willpower itself would sound fake to anyone who didn’t have to worry about it. “Oh, so you can go partying with your friends whenever you want, but as soon as it comes time to write a ten page report, your ‘lack of willpower’ prevents you from doing it? A likely story!” Still, all of us (except Bryan Caplan) recognize how real and important willpower is - how having more of it is better than having less of it, and how some condition that caused you to have pathologically little of it would be a huge disaster. In the comments section to the rough draft of this post, CJ wrote: I will say - I was one of those types of men to scoff with skepticism at people claiming to have chronic fatigue and the like. I would have called those people lazy and would have been adamant they were faking it or feeling like crap because of unhealthy lifestyle choices. 
Unfortunately I have learned the hard way the severity of neurological conditions, what it feels like to have brain fog, what chronic fatigue feels like, and how difficult it can be to communicate neurological symptoms to others. I now start from a position of listening to people who are willing to open up about their symptoms and trust that they are being honest. There are millions of people suffering in silence with untreated and undiagnosed disorders - those people are not all faking it or just dealing with psychosomatic conditions. I would recommend Jennifer Brea's documentary, Unrest. Thank you for shedding some light on the subject. Heron added: I second the suggestion to watch 'Unrest,' and to consider the many unseen ill whose symptoms are deemed to be imagined. Until this last year, I had little patience with, and doubted, people who I saw as hypochondriacs. Then I became the thing I hated. Myalgic Encephalomyelitis/Chronic Fatigue Syndrome and Long COVID do have similarities from what I've read, since becoming ill in August 2020. At that time, here in Northern Ireland, there was scant availability of COVID tests; after spending three days trying to get hold of one, (by which time I'd stopped teaching my post-grad online classes & I haven't worked since) I became too ill to do anything. I figured if this was COVID I'd gotten off lightly, mostly constant severe headache, inability to think, a new experience of fatigue, high temperature, insomnia, hypersomnia, paresthesia, no smell or taste etc Debilitated but not dead. Except for the fact that I still have the aforementioned symptoms a year on and whilst they fluctuate in type and severity, the fatigue, headaches and cognitive difficulties are real. 
A brain scan, an appointment for brain and spinal MRIs (waiting lists, even when going private [as NHS has 3-8 yr waiting lists here in NI] are lengthy), rare virtual doctors and neurologists suggest my ailments constitute a post-viral thing, maybe Long C, they can offer nothing but pills for pain. There is no test for ME/CFS yet, nor a Long C test, symptoms and presentation are so varied. Given a widespread lack of knowledge and resources regarding these ailments, you're on your own. Maybe I've developed ME, I certainly have post-exertional malaise which my very prominent neurologist hadn't heard of. Looking at the history of ME/CFS* and a dearth of research surrounding it, I hope that rather than dismiss the lives of sufferers of this or the long-lasting aftermath of COVID, that those experiencing such difficulties will be heard and learnt from. I only understood when I had no alternative. I don’t think I ever actively pooh-poohed CFS, but like everyone else who encountered it, I underestimated just how bad it was until I met some patients with the condition. It is real and really bad. For whatever reason it is hard to think about and take seriously, but it really is as bad as people say. </preachiness> 6. Long COVID is probably rare in children This matters a lot, because children are (currently) ineligible for the vaccine, and also likely to encounter the virus at school. But children usually have mild cases of COVID and don’t die from it, so it’s tempting to just not worry about them. But if they could get Long COVID, that would make it much less tempting. Preliminary Evidence On Long COVID In Children sounds like a good paper to draw conclusions from. It says 42.6% of children with COVID experience long-term follow-up symptoms, which would be higher than the rate for adults. But it has no control group, and most of the symptoms it finds don’t seem very COVID-related (eg rashes, constipation). 
The most common symptom (20%) is insomnia, which better studies in adults fail to associate with real Long COVID. The rate of known long COVID symptoms (eg taste and smell problems) is only about 3-4%, and no higher or lower than anything else. Probably these kids are just having problems at the usual rate and attributing them to their recent COVID. Blankenburg et al do the correct thing and ask a thousand children about potential symptoms, then compare the number who say yes vs. no among COVID-seropositive and seronegative subjects. They find no difference between the two groups. Both are reporting a lot of insomnia, etc. They reasonably attribute this to pandemics being a stressful event that it’s natural to lose sleep over. This is really reassuring, but it can’t rule out a somewhat rarer syndrome. The authors say that they might miss symptoms with a prevalence of less than 10%, and one of them gives his own personal guess that it’s 1%. An English team says there’s a Long COVID rate of 4.6% in kids. But there was a 1.7% rate of similar symptoms in the control group of kids who didn’t have COVID, so I think it would be fair to subtract that and end up with 2.9%. And even though the study started with 5000 children, so few of them got COVID, and so few of those got long COVID, that the 2.9% turns out to be about five kids. I don’t really want to update too much based on five kids, especially given the risk of recall bias (ie you might notice / care about your symptoms more if you know you had COVID before getting them). My overall conclusion here is that long COVID is rarer in children than adults, and may not exist at all. The studies tell us it’s probably somewhere less than 5% of kids, but so far we can’t conclude anything stronger than that. 7. Vaccination probably doesn’t change the per-symptomatic-case risk of Long COVID much Here’s a complicated Twitter thread about this. 
Of vaccinated people who got symptomatic COVID, about a third ended up with Long COVID symptoms, the same rate as in unvaccinated people. Of course, vaccinated people are much less likely to get symptomatic COVID. But even conditional on getting it, they’re still much less likely to go to the hospital, die, etc. It would have been nice if the same was true of getting Long COVID. But it doesn’t look that way. (all this information is from an online poll by a sketchy group of COVID “survivor” activists. But they wrote up their poll in the scientific paper font, as a PDF and everything, so I say we count it anyway) This NEJM study wasn’t exactly designed to look for Long COVID in vaccinated people. But they found it anyway, at a rate of 19% after 6 weeks. This also fits within the (wide) range reported for unvaccinated people. They don’t give a symptom breakdown beyond “prolonged loss of smell, persistent cough, fatigue, weakness, dyspnea, or myalgia”, which sounds like the usual set. These studies are pretty weak, and you could argue that given that vaccines decrease the average severity of COVID infection, and infection severity is linked to Long COVID risk, we should have a strong prior on vaccines decreasing Long COVID risk. And just before publishing this, someone sent me this study, which very preliminarily finds vaccines might decrease Long COVID risk by a factor of 2. I think a factor of 2-3 is believable; one of 10 or 20, less so. Weirdly, there are some claims that vaccines can help relieve symptoms of existing long COVID. Sounds kind of like sympathetic magic to me, but the researcher quoted in the linked article said it might “improve symptoms by eliminating any virus or viral remnants left in the body” or by “rebalancing the immune system”. So yeah, sympathetic magic. 8. Your risk of a terrible long COVID outcome conditional on COVID is probably between a few tenths of a percent and a few percent. 
My original calculation went like this: About 25% of people who get COVID report long COVID symptoms. About half of those go away after a few months, so 12.5% get persistent symptoms. Suppose that half of those cases (totally made-up number) are very mild and not worth worrying about. Then 6.25% of people who get COVID would have serious long-lasting Long COVID symptoms. After doing that calculation, I read this essay by Matt Bell, who tries to figure out the same thing. He is much more optimistic. He agrees that about half of long COVID cases go away after a few months, but adds another 50% decrease from “few months” to “lifelong”, kind of on priors, admitting there’s not too much positive evidence for this. Then he adds another factor-of-two decrease from vaccination, based on very preliminary studies from the UK. He estimates that someone with my demographics (vaccinated man in his 30s) has a 2% risk of Long COVID conditional on getting COVID at all. Then he divides by five for the true worst case scenario, based on studies showing that a fifth of people with Long COVID report that it affects their daily activities “a lot”. So by his final number, I have an 0.4% chance of getting really terrible long COVID, conditional on getting COVID at all. My friend AcesoUnderGlass also did a writeup of this, published after I did my first-draft calculation, which seems to be thinking of this very differently, based entirely on hospitalization rates (which of course are very low in vaccinated people our age). She accordingly concludes that risk is very low. I don’t really understand her reasoning here, but I trust her a lot and am working on trying to converge with her on this. What’s my yearly risk of getting COVID if I try to live a normal life? This site says only 0.1% of vaccinated Californians have gotten COVID after their vaccination. But vaccination was pretty new when that survey was done, so we might want to take this as a per one-to-two-months estimate. 
That would mean a risk of 0.5 - 1 percent per year. But not all these people are living normal lives, so my risk might be higher. MicroCOVID gives me a good sense of how careful I’d have to be to stay within a risk budget of 1% COVID risk per year. When I play around with it, I think I am about 5x - 10x less careful than that, which would mean a risk of about 5%/year. This tracker suggests my area has recently had about 1 new case per thousand people per week, which would imply 5% per year. But most of those people are probably unvaccinated, so my risk would be significantly lower than that. I’m going to round all of this off to about 1% - 10% per year of getting a breakthrough COVID case (though obviously this could change if the national picture got better or worse). Combined with the 0.4% to 6.25% risk of getting terrible long COVID conditional on getting COVID, that’s between a 1/150 - 1/25,000 chance of terrible long COVID per year. How does this compare to other risks? My ordinary risk of death per year, just from being a man in his 30s, is about 1/700 (though this includes drug abusers and stunt pilots, so my real risk might be lower, let’s say 1/1000). Here are some other risks, courtesy of the BMJ: In this context, I find the 1/150 risk pretty scary and the 1/25,000 risk not scary at all, so, darn, I guess there’s not yet enough data to have a strong sense of how concerned I should be. 9. This is hard to compare to other postviral syndromes Going into this, I wondered if we might be able to ignore Long COVID. The argument would go like this: all viral diseases have a risk of postviral syndromes. Colds, flus, mono, lots of stuff that’s going around all the time. Lots of people get those postviral syndromes, and either recover or don’t, but either way we don’t make a big deal out of it. Since COVID’s considered “newsworthy” in a way flu isn’t, we obsess over its postviral syndrome even though it’s no worse than anything else’s. 
This wouldn’t make Long COVID any less bad, and maybe we would be wrong to not panic more about colds and the flu, but it would at least give us some context and make things feel less scary. Unfortunately, I can’t find anything supporting or opposing this picture. The only relevant study is a meta-analysis by Poole-Wright et al, who (contra nominative determinism) don’t pool the studies by condition, which makes it hard to draw conclusions. I think all of their examples of postviral syndrome after flu are from severe hospitalized cases, so any comparison with COVID would be unfair. Although there do seem to be scattered reports of post-flu problems, they’ve never been formally studied or quantified. Mononucleosis is an infectious disease caused by the Epstein-Barr virus, affecting about 1/2000 people per year in developed countries. It has a famously nasty postviral syndrome, which this paper describes as “almost one-half of the group had substantial ongoing symptoms 2 months after onset and… ∼10% had disabling symptoms marked by fatigue lasting ≥ 6 months”. Flu is as common as COVID, but nobody really talks about it having a significant postviral syndrome so probably it’s not that bad. Mono has a worse postviral syndrome than COVID, but it’s rare enough that it doesn’t cause massive society-wide effects. COVID is right in the middle: more common than mono, and (probably) worse postviral syndrome than flu. I think it’s fair to say that we may not have encountered a condition with this exact combination of risk factors and can’t dismiss it as similar to conditions we currently ignore. One potential analogue might be the Spanish Flu of 1918. It was an equally widespread pandemic, and seemed to have some kind of postviral syndrome. 
From TIME: In what is now Tanzania, to the north, post-viral syndrome has been blamed for triggering the worst famine in a century—the so-called “famine of corms”—after debilitating lethargy prevented flu survivors from planting when the rains came at the end of 1918. “Agriculture suffered particular disruption because, not only did the epidemic coincide with the planting season in some parts of the country, but in others it came at the time for harvesting and sheep-shearing.” Kathleen Brant, who lived on a farm in Taranaki, New Zealand, told Rice, the historian, about the “legion” problems farmers in her district encountered following the pandemic, even though all patients survived: “The effects of loss of production were felt for a long time.” The 1918 flu seemed to have lots of psychiatric effects: “Norwegian demographer Svenn-Erik Mamelund provided such evidence when he combed the records of psychiatric institutions in his country to show that the average number of admissions showed a seven-fold increase in each of the six years following the pandemic, compared to earlier, non-pandemic years.” Coronavirus doesn’t - the excellent Amin-Chowdhury study above finds nothing. Still, this is the scale of thing I’m worried about. The worst case scenario here is really really bad. If a few percent of COVID patients get long-term unremitting genuine CFS/ME, that has the potential to overwhelm government welfare budgets and long-term depress the economy. I think there’s a 90% chance the real situation isn’t that bad, but it’s scary that we can’t entirely rule it out. Aside from the somewhat different 1918 case, I don’t think we have any historical experience of dealing with postviral syndromes at this scale. The medium case scenario is something more like “a few percent of infected people get moderate fatigue, which doesn’t really prevent them from working, and goes away after a few years”. 
I don’t know whether the level of media attention paid to this would converge on “boring and nobody notices” or “giant disaster”, and I think it would be compatible with either. 10. Conclusions 1. Long COVID is many different issues without a common mechanism. 2. Some of these are straightforward and not surprising, eg lung scarring and post-ICU syndrome from severe infection, and would happen in any disease of this severity. Others seem to be more like the poorly-understood postviral syndromes associated with several other diseases. While some symptoms may be psychosomatic, most are probably organic. 3. The three major categories of symptoms are straightforward cardiovascular-pulmonary issues, straightforward smell and taste issues, and more mysterious neurological issues. 4. Although these get better with time in some people, in a significant number (maybe ~50% of people who had them at six weeks) they persist for as long as anyone has been able to measure them (a few months in the case of COVID, a year or two in the case of comparable syndromes). 5. Post-COVID fatigue is particularly concerning. This would be very bad if we analogized it to CFS/ME, and still pretty bad if we analogized it to other known postviral syndromes. There is no proof that this always gets better over the long term, although no study has looked at them for more than a few years. Facing postviral fatigue on this scale is a new problem. 6. Children probably get Long COVID less than adults, probably at a rate of less than 5% of symptomatic cases. But we don’t know how much less, and we can’t rule out that some children get pretty severe symptoms. 7. Although vaccination decreases the risk of symptomatic COVID, it probably doesn’t decrease the risk of Long COVID per symptomatic COVID case by very much, though it might decrease it by a factor of 2-3. 8. 
Your chance of really bad debilitating lifelong Long COVID, conditional on getting COVID, is probably somewhere between a few tenths of a percent, and a few percent. Your chance per year of getting it by living a normal lifestyle depends on what you consider a normal lifestyle and on the future course of the pandemic. For me, under reasonable assumptions, it’s probably well below one percent. EDIT: Here are some other people who tried to do this same analysis. I learned about all of these after I wrote the first draft of this, so you can consider the basic thought process here to be independent of them - but I edited some things to account for what I learned from them before writing the final version. AcesoUnderGlass: Long COVID Is Not Necessarily Your Biggest Problem
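For readers who want to retrace the arithmetic in section 8, here is a minimal sketch in Python. It only multiplies the figures quoted in the essay (the 25% / half / half chain, Matt Bell's 2% divided by five, and the 1% to 10% yearly breakthrough range). The variable and function names are made up for illustration, and the inputs are the essay's rough guesses rather than measured values.

```python
# Rough reconstruction of the essay's back-of-envelope risk estimate.
# All inputs are the essay's own approximate figures; names are illustrative.

def one_in(probability: float) -> int:
    """Express a probability as the N in '1 in N', rounded to a whole number."""
    return round(1 / probability)

# Conditional risk of terrible Long COVID, given a COVID case.
# Pessimistic chain: 25% report symptoms, half persist, half of those are serious.
conditional_high = 0.25 * 0.5 * 0.5   # 6.25%
# Optimistic chain (Matt Bell): 2% Long COVID risk, a fifth of which is severe.
conditional_low = 0.02 / 5            # 0.4%

# Yearly risk of a breakthrough case while living a fairly normal life.
annual_low, annual_high = 0.01, 0.10  # the essay's rounded 1% to 10% range

# Yearly risk of terrible Long COVID = yearly case risk x conditional risk.
worst_case = annual_high * conditional_high
best_case = annual_low * conditional_low

print(f"worst case: about 1 in {one_in(worst_case)} per year")
print(f"best case:  about 1 in {one_in(best_case)} per year")
```

The worst case comes out at about 1 in 160, which the essay rounds to 1/150, and the best case at 1 in 25,000. The two ends differ by a factor of roughly 150, which is why the essay concludes there is not yet enough data to know how worried to be.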
In case you find this hard to follow: ivermectin is an antiparasitic drug that looked promising against COVID in early studies. Later it started looking less promising, and investigators found that a major supporting study was fraudulent. But by this point it had gotten popular among conspiracy theorists as a suppressed coronavirus cure that They Don’t Want You To Know. The media has tried to spread the word that the scientific consensus remains skeptical. In the process, they may have gone a little overboard and portrayed it as the world’s deadliest toxin that will definitely kill you and it will all somehow be Donald Trump’s fault. It turned into the latest culture war issue, and now there’s a whole discourse on (for example) how supposedly-sober fact-checkers keep calling it "a horse dewormer” (it is used to deworm horses, but it’s also FDA-approved for humans, but lots of the people using it are buying the horse version), and probably this is hypocritical in some way. Enter the article above. A doctor named Jason McElyea apparently told local broadcaster KFOR that Oklahoma hospitals are “overwhelmed” with ivermectin poisoning cases, so much so that “gunshot victims” are “left waiting”. Some of the world’s biggest news outlets heard the story and ran with it. The tweet mentions the Rolling Stone version, but the same story, with the same doctor’s testimony, got picked up by The Guardian, the BBC, Yahoo News, etc. Which brings us to the Sequoyah Hospital letter on the right. They released a statement saying that Dr. McElyea hasn’t worked there in two months, they haven’t had any ivermectin overdose cases, and they don’t know what he’s talking about. In the comments, author Virginia Hume sums up the situation nicely: I’ve recently been reading Scout Mindset (expect a review soon), which is kind of the rationalist movement in book form. It focuses on the difference between how we treat ideas that conform to our biases versus those that contradict them. 
If they conform, we ask “Are we allowed to accept this?” and wave them through, like a small town police chief dealing with a case involving the mayor’s son. If they contradict, we subject them to the harshest inquisition possible, like a small town police chief dealing with a guy named “Abdullah” with a sinister-looking beard. The media was already looking to discredit ivermectin. So the report of one doctor - without even a phone call to confirm - was good enough for Rolling Stone, The Guardian, BBC, etc. It was “too good to check”. II. Did you believe that? I did, briefly. Then I remembered the Law Of Rationalist Irony: the smugger you feel about having caught a bias in someone else, the more likely you are falling victim to that bias right now, in whatever way would be most embarrassing. So, quick check: am I doing this? I notice this story is exactly tailored to appeal to me and people like me. It discredits the media establishment, who I don’t like. It’s a great argument for why we need more rationality, something I’ve been trying to push. It lets me feel superior to everyone: I am properly skeptical of ivermectin, but also I haven’t become a contemptible propagandist who joins in mass media smear campaigns. And I didn’t even take a second to check if it was true! I’m relying entirely on the word of a Twitter bluecheck I’ve never heard of before, whose profile picture is some kind of dog (an Australian sheepdog? maybe some kind of weird collie?) Forget making a phone call to a hospital, I didn’t even read the original article! The story was “too good to check”! So I tried checking, and noticed that the third reply to the original tweet was this: In case you’re as confused as I am, NHS here = “Northeastern Health System”, an Oklahoma health care group. Britain is not involved. This…turns out to be completely true. The story never mentions Sequoyah Hospital! Dr. McElyea has worked at Sequoyah in the past, but he’s a traveling doctor and works lots of places. 
Plausibly Sequoyah just wanted to clarify that they weren’t like the hospitals in the story, they’re not turning away gunshot victims, and if you happen to be a gunshot victim you’re still welcome to go to Sequoyah and can expect timely care. Apparently I’m not the only person who doesn’t scroll down to the third tweet. The right-wing Washington Examiner has an article on how Rolling Stone’s Ivermectin Fiction Shows Why Republicans Don’t Trust Media. Fox has an article on Rolling Stone Forced To Issue Update After Viral Ivermectin Story Turns Out To Be False. One Redditor puts it more bluntly: “Dr. Jason McElyea, who has been claiming that emergency rooms have been turning away gunshot victims because of Ivermectin overdoses, is a liar.” None of these sources mentioned that the original article had never claimed Sequoyah Hospital was involved. Their story was - I guess - too good to check. III. Did you believe that? I mean, that’s also a pretty cool story, isn’t it? Right-wing news outlets accuse the so-called “liberal media” of bias, then get hoist on their own petard? Seems a bit too cute. Have you clicked through to any of the links yet? No? Not even after I admitted I’m probably biased here? Sequoyah Hospital might not be the particular hospital that the doctor in the story was thinking of. But isn’t it suspicious that other hospitals are so packed with ivermectin cases that they have to delay care to gunshot victims, yet Sequoyah says that it “has not treated any patients due to complications of treating ivermectin”? Seems weird for there to be that much difference. Okay, this time I promise I’m not trying to psych you out. Here’s what I’ve actually been able to figure out about this situation: Rolling Stone seems to think that the Sequoyah Hospital statement casts doubt on their account. 
They changed the title of their article to “One Hospital Denies Oklahoma Doctor’s Story…” and edited in a long prologue about the hospital’s statement in a way that suggests they feel bad about their reporting. They say that they have reached out to various relevant doctors and hospitals for comments but have not heard back from them - which I guess is good, because if your hospital is so busy that you don’t have time to treat gunshot victims, you really shouldn’t have time to give interviews to Rolling Stone.
In an unrelated issue, the photo on top of their article was previously a bunch of Oklahoman-looking people standing in a long line outside a building. This had a caption, in small print, saying that it was of Oklahomans waiting for the COVID vaccine. Critics pointed out that in context, people would have interpreted it as being a picture of people waiting outside a hospital which had long lines because it was too full of ivermectin victims. Whether or not that criticism was fair, Rolling Stone has taken down that photo and replaced it with a photo of ivermectin pills.
8: Hopefully not related: self-defeating admonitions to Trust Science (look at that scatterplot and that trend line!) pnas.org/content/118/40… (h/t @EricTopol) (tweet by Céline Gounder, MD, ScM, FIDSA, @celinegounder, September 27, 2021)
9: MR: This Experiment Will Be Run: New York Public Library, in order to protect “vulnerable communities” and “grapple with inequality”, eliminates late fees for books. But before making a snap judgment based on your preconceptions (or on the library president’s last name) read the comments (wait, when did MR comments start being good?!) which explain that this has already been tried in many other cities, you still can’t take out new books until you return or replace the old ones, and having a potential monetary fine looming over your head for forgetting something turns a lot of people off (especially poor people, but also everyone else). I think the best lens for this is behavioral econ - fines were a kind of “reverse nudge” that made people nervous and unhappy far out of proportion to any good they did, so the library system is being restructured to route around them.
16: Early in the COVID pandemic, I linked to a theory that getting a smaller dose of virus meant less severe disease (so, for example, a mask that blocked 95% of viruses would still be useful, even though 5% of viruses is enough to infect you). NEJM recently published an evidence-free article vaguely against this, and Stephan Guyenet says it doesn’t always apply for other diseases.
Both cognitive and non-cognitive skills increase your income a lot, no surprise there. Both sets of skills improve your lifespan (probably more educated people are better at judging health advice - get your COVID vaccine!) and prevent you from making bad decisions like teenage pregnancy, smoking, or excessive drinking.
(I also wouldn’t be too impressed even if the forecasters did get the same findings as Brauner et al, because one likely route to that would be the same one I took - you’ve resolved to judge various coronavirus interventions, you notice Brauner et al is clearly the best paper, and so you report its results.)
The paper continues to an empirical study. The authors ran a forecasting tournament on various easily-checkable things like COVID vaccinations, commodity prices, and the weather. Forecasters were separated into three conditions: reciprocal scoring, traditional scoring (ie Brier score + incentives), and no scoring. The no scoring team did worse than the normal scoring team, which is the basic insight Tetlock et al have found again and again: scored and incentivized forecasts are better than random people pontificating on things. But more relevantly for this paper, the reciprocal scoring and traditional scoring did basically the same!
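For context on the "traditional scoring" condition, here is a minimal Python sketch of a Brier score for yes/no questions: the mean squared difference between the forecast probability and what actually happened (1 or 0). This is the standard textbook definition. The paper's exact scoring rule and incentive scheme are not described here, so treat the function name and the example numbers as illustrative assumptions.

```python
# Brier score for binary forecasts: lower is better, 0.0 is perfect.
# The example forecasts below are invented for illustration only.

def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equal-length, non-empty forecasts and outcomes")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# A forecaster who says 50% on everything scores 0.25 no matter what happens.
print(brier_score([0.5, 0.5, 0.5], [1, 0, 1]))

# A confident forecaster is rewarded when right and punished when wrong.
print(brier_score([0.9, 0.1, 0.8], [1, 0, 1]))
```

A score like this needs a resolved outcome for every question. Reciprocal scoring is meant for the cases where no such outcome will ever be available.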
Then they tried something more ambitious. They asked teams to “predict” the number of lives saved by various COVID interventions. These interventions had already happened or not, there was no way to ever empirically resolve the predictions. This was supposed to serve as an example of the exciting new things you can do with reciprocal scoring.
This is from ivmmeta.com, part of a sprawling empire of big professional-looking sites promoting unorthodox coronavirus treatments. I have no idea who runs it - they’ve very reasonably kept their identity secret - but my hat is off to them. Each of these study names links to a discussion page which extracts key outcomes and offers links to html and pdf versions of the full text. These same people have another 35 ivermectin studies with different inclusion criteria, subanalyses by every variable under the sun, responses and counterresponses to everyone who disagrees with them about every study, and they’ve done this for twenty-nine other controversial COVID treatments. Putting aside the question of accuracy and grading only on presentation and scale, this is the most impressive act of science communication I have ever seen. The WHO and CDC get billions of dollars in funding and neither of them has been able to communicate their perspective anywhere near as effectively. Even an atheist can appreciate a cathedral, and even an ivermectin skeptic should be able to appreciate this website. What stands out most in this image (their studies on early treatment only; there are more on other things) is all the green boxes on the left side of the table. A green box means that the ivermectin group did better than placebo (a red box means the opposite). This isn’t adjusted for statistical significance - indeed, many of these studies don’t reach it. The point of a meta-analysis is that things that aren’t statistically significant on their own can become so after you pool them with other things. If you see one green box, it could mean the ivermectin group just got a little luckier than the placebo group. When you see 26 boxes compared to only 4 red ones, you know that nobody gets that lucky. Acknowledging that this is interesting, let’s detract from it a little. 
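As a rough check on the 26-versus-4 tally, here is a minimal Python sketch of a one-sided sign test. It asks: if ivermectin did nothing, and each study were an independent coin flip between a green box and a red box, how often would 26 or more of 30 come up green? The function name is made up for illustration. The independence and coin-flip assumptions are mine, not ivmmeta's, and they are exactly what fraud, bias, and confounding can break.

```python
# One-sided sign test: P(at least `wins` successes in `total` fair coin flips).
# Assumes independent studies and a 50/50 null; both are simplifying assumptions.
from math import comb

def sign_test_p_value(wins: int, total: int) -> float:
    """Exact binomial tail probability P(X >= wins) for X ~ Binomial(total, 0.5)."""
    if not 0 <= wins <= total:
        raise ValueError("wins must be between 0 and total")
    favorable = sum(comb(total, k) for k in range(wins, total + 1))
    return favorable / 2 ** total

p = sign_test_p_value(26, 30)
print(f"P(26 or more green boxes out of 30 by luck) = {p:.2e}")  # about 3 in 100,000
```

So under those assumptions the tally is not plausibly luck. What the test cannot say is whether the boxes are green because the drug works or because something is wrong with the studies.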
First, this presentation can exaggerate the effect size (represented by how far the green boxes are to the left of the gray line in the middle representing no effect). It focuses on the most dire outcome in every study - death if anybody died, hospitalization if anyone was hospitalized, etc. Most studies are small, and most COVID cases do fine, so most of these only have one or two people die or get hospitalized. So the score is often something like “ivermectin, 0 deaths; placebo, 1 death”, which is an infinitely large relative risk, and then the site rounds it down to some very high finite number. This methodology naturally produces very big apparent effects, and the rare studies where ivermectin does worse than placebo are equally exaggerated (one says that ivermectin patients are 600% more likely to end up hospitalized). But this doesn’t change the basic fact that ivermectin beats placebo in 26/30 of these studies. Second, this presents a pretty different picture than you would get reading the studies themselves. Most of these studies are looking at outcomes like viral load, how long until the patient tests negative, how long until the patient’s symptoms go away, etc. Many of these results are statistically insignificant or of low effect size. I went through these studies and tried to get some more information for my own reference: # is how many people were in the smallest relevant group (eg if there were 20 people in placebo and 10 in ivermectin, it was 10). Dose is ivermectin dose x number of days. Tested w/ is what drugs were given alongside ivermectin; compare is what drugs were in the “placebo” group (I excluded some very common things like paracetamol). %-PCR7 is what percent of patients had a negative PCR test (indicating recovery) after 7 days (though if 7 wasn’t available, I accepted anything from 6-12); the (I) and (P) are ivermectin and placebo groups. R is the ratio - green if statistically significant, red otherwise. 
DaysPCR is how many days it took to get a negative PCR test. Days to -sym are how many days it took symptoms to resolve. -outc is some serious negative outcome in the study, either clinical worsening, hospitalization, or death. I was inconsistent which one I chose, trying to pick whichever I thought struck a balance between high sample size and severity. Since this was almost never significant, I made it blue if it favored ivermectin and orange if it favored placebo (which it never did; there is no orange). Lowest p is the lowest p-value in the study for one of the headline results. 1o+ is whether the primary outcome was positive or not. I made this very quickly and unprincipledly and I am sure there are a lot of errors; please forgive me. Of studies that included any of the endpoints I recorded, ivermectin had a statistically significant effect on the endpoint 13 times, and failed to reach significance 8 times. Of studies that named a specific primary endpoint, 9 found ivermectin affected it significantly, and 12 found it didn’t. But that’s still pretty good. And “doesn’t affect to a statistically significant degree” doesn’t mean it doesn’t work. It might just mean your study is too small for a real and important effect to achieve statistical significance. That’s why people do meta-analyses to combine studies. And the ivmmeta people say they did that and it was really impressive. All of this is still basically what things would look like if ivermectin worked. But of course we can’t give every study one vote. We’ve got to actually look at these and see which ones are good and which ones are bad. So, God help us, let’s go over all thirty of the ivermectin studies in this top panel of ivmmeta.com. (if you get bored of this, scroll down to the section called “The Analysis”) The Studies Elgazzar et al: This one isn’t on the table above, but we can’t start talking about the others until we get it out of the way. 
600 Egyptian patients were randomized into six groups, including three that got ivermectin. The ivermectin groups did substantially better: for example, 2 vs. 20 deaths in ivermectin group 3 vs. non-ivermectin group 4. There were various other equally impressive outcomes. Unfortunately, it’s all false. Some epidemiologists and reporters were able to obtain the raw data (it was password-protected, but the password was “1234”), and it was pretty bizarre. Some patients appeared to have died before the trial started; others were arranged in groups of four such that it seemed like the authors had just copy-pasted the same four patients again and again. Probably either the study never happened, or at least the data were heavily edited afterwards. You can read more here. A lot of the apparent benefit of ivermectin in meta-analyses disappeared after taking out this paper (though remember, this isn’t even on the table at the top of the post, so it doesn’t directly affect that). Since the Elgazzar debacle, a group of researchers including Gideon Meyerowitz-Katz, Kyle Sheldrick, James Heathers, Nick Brown, Jack Lawrence, etc, have been trying to double-check as many other ivermectin studies as possible. At least three others - Samaha, Carvallo, and Niaee - have similar problems and have been retracted. Those studies were all removed before I screenshotted the table above, and they’re not on there. But everybody is pretty paranoid right now and looking for fraud a lot harder than they might be in normal situations. Moving on: Chowdhury et al: Bangladeshi RCT. 60 patients in Group A got low-dose ivermectin plus the antibiotic doxycycline, 56 in Group B got hydroxychloroquine (another weird COVID treatment which most scientists think doesn’t work) plus the antibiotic azithromycin. No declared primary outcome. Ivermectin group got to negative PCR a little faster than the other (5.9 vs. 7 days) but it wasn’t statistically significant (p = 0.2).
A couple of other non-statistically-significant things happened too. 2 controls were hospitalized, 0 ivermectin patients were. This is a boring study that got boring results, so nobody has felt the need to assassinate it, but if they did, it would probably focus on both groups getting various medications besides ivermectin. None of these other medications are believed to work, so I don’t really care about this, but you could tell a story where actually doxycycline works great at addressing associated bacterial pneumonias, or where HCQ causes lots of side effects and that makes the ivermectin group look good in comparison, or whatever. Espitia-Hernandez et al: Mexican trial which is probably not an RCT - all it says is that “patients were voluntarily allocated”. 28 ended up taking a cocktail of low-dose ivermectin, vitamin D, and azithromycin; 7 were controls. On day ten, everyone (!) in the experimental group was PCR negative; everyone (!) in the control group was still positive. Also, symptoms in the experimental group lasted an average of three days; in the control group, more like 10. These results make ivermectin look amazingly super-good, probably better than any other drug for any other disease, except maybe stuff like vitamins for treatment of vitamin deficiency. Any issues? We don’t know how patients were allocated, but they discuss patient characteristics and they don’t look different enough to produce this big an effect size. The experimental group got a lot of things other than ivermectin, but I would be equally surprised if vitamin D or azithromycin cured COVID this effectively. It deviated from its preregistration in basically every way possible, but you shouldn’t be able to get “every experimental patient tested negative when zero control patients did” by garden-of-forking-paths alone! But this has to be false, right? Even the other pro-ivermectin studies don’t show effects nearly this big. 
In all other studies combined, ivermectin patients took an average of 8 days to recover; in Espitia-Hernandez, they took 3. Also, it’s pretty weird that the entire control group had positive PCRs on day 10 - in most other studies, a majority of people had negative PCRs by day 7 or so, regardless of whether they were in the treatment or control group. Everything about this is so shoddy that I can easily believe something went wrong here. I don’t have a great understanding of this one but I don’t trust it at all. Luckily it is small and non-randomized so it will be easy to ignore going forward. I’m not saying this is related, but I’m not saying it *isn’t* related either. Carvallo et al: This one has all the disadvantages of Espitia-Hernandez, plus it’s completely unreadable. It’s hard to figure out how many patients there were, whether it was an RCT or not, etc. It looks like maybe there were 42 experimentals and 14 controls, and the controls were about 10x more likely to die than the experimentals. Seems pretty bad. On the other hand, another Carvallo paper was retracted because of fraud: apparently the hospital where the study supposedly took place said it never happened there. I can’t tell if this is a different version of that study, a pilot study for that study, or a different study by the same guy. Anyway, it’s too confusing to interpret, shows implausible results, and is by a known fraudster, so I feel okay about ignoring this one. Mahmud et al: RCT from Bangladesh. 200 patients received ivermectin plus doxycycline, 200 received placebo. Everything was written up very nicely in real English, by people who were clearly not on 34 lbs of meth at the time. They designated a primary outcome, “number of days required for clinical recovery”, and found a statistically significant difference at p < 0.001: Okay, fine, they misspelled “recovery” once. But they spelled it right the other time! That puts it in the top 50% for ivermectin papers!
The fraud-hunters have examined this paper closely and are unable to find any signs of fraud. Kyle Sheldrick tweeted on July 17, 2021, linking to the PubPeer comments on the Mahmud trial: “I have now reviewed the individual patient data master sheet. I did not find any irregularities and the summary data matches the published data.” I think this paper is legitimate and that its findings need to be seriously considered. Serious consideration doesn’t always mean they’re true - sometimes if we have strong evidence otherwise we can dismiss things without understanding why. And there’s always the chance it was a fluke, right? Can something have a p-value less than 0.001 and still be a fluke? Szenta Fonseca et al: This is a chart review from Brazil. Researchers looked at various people who had been treated for COVID in an insurance company database, saw whether they got ivermectin or not, and saw whether the people who got it did better or worse. About a hundred people got it, and a few hundred others didn’t. The people who got it did not do any better than anyone else, and you’ll notice this is one of the rare red boxes on the table above. But we shouldn’t take this study seriously.
Nobody took any effort to avoid selection bias, so it’s very possible that sicker people were given more medication (including ivermectin), which unfairly handicaps the ivermectin group. Also, it’s hard to tell from the paper who was on how much of what, and the discussion of ivermectin seems like kind of an afterthought after discussing lots of other meds in much more depth. This is another one I feel comfortable ignoring. Cadegiani et al: A crazy person decided to put his patients on every weird medication he could think of, and 585 subjects ended up on a combination of ivermectin, hydroxychloroquine, azithromycin, and nitazoxanide, with dutasteride and spironolactone "optionally offered" and vitamin D, vitamin C, zinc, apixaban, rivaroxaban, enoxaparin, and glucocorticoids "added according to clinical judgment". There was no control group, but the author helpfully designated some random patients in his area as a sort-of-control, and then synthetically generated a second control group based on “a precise estimative based on a thorough and structured review of articles indexed in PubMed and MEDLINE and statements by official government agencies and specific medical societies”. Patients in the experimental group were twice as likely to recover (p < 0.0001), had negative PCR after 14 vs. 21 days, and had 0 vs. 27 hospitalizations. Speaking of low p-values, some people did fraud-detection tests on another of Cadegiani’s COVID-19 studies and got values like p < 8.24E-11 in favor of it being fraudulent. And, uh, he’s also studied whether ultra-high-dose antiandrogens treated COVID, and found that they did, cutting mortality by 92%. But the trial is under suspicion, with a BMJ article calling it “[the worst] violations of medical ethics and human rights in Brazil’s history” and “an ethical cesspit of violations”. [update 2022: this section originally contained more accusations against Cadegiani.
Alexandros Marinos does a deeper dive with information not available at the time I wrote this, and finds some of them were overstated or false by implication] Anyway, let’s not base anything important on the results of this study, mmkay? A defiant Flavio Cadegiani. Imagine a guy who looks like this telling you to take ultra-high-dose antiandrogens. Ahmed et al: And we’re back in Bangladesh. 72 hospital patients were randomized to one of three arms: ivermectin only, ivermectin + doxycycline, and placebo. Primary endpoint was time to negative PCR, which was 9.7 days for ivermectin only and 12.7 days for placebo (p = 0.03). Other endpoints including duration of hospitalization (9.6 days ivermectin vs. 9.7 days placebo, not significant). This looks pretty good for ivermectin and does not have any signs of fraud or methodological problems. If I wanted to pick at it anyway, I would point out that the ivermectin + doxycycline group didn’t really differ from placebo, and that if you average out both ivermectin groups (with and without doxycycline) it looks like the difference would not be significant. I had previously committed to considering only ivermectin alone in trials that had multiple ivermectin groups, so I’m not going to do this. I can’t find any evidence this trial was preregistered so I don’t know whether they waited to see what would come out positive and then made that their primary endpoint, but virological clearance is a pretty normal primary endpoint and this isn’t that suspicious. It’s impossible to find any useful commentary on this study because Elgazzar (the guy who ran the most famous fraudulent ivermectin study) had the first name Ahmed, everyone is talking about Elgazzar all the time, and this overwhelms Google whenever I try to search for Ahmed et al. For now I’ll just keep this as a mildly positive and mildly plausible virological clearance result, in the context of no effect on hospitalization length or most symptoms. 
Chaccour et al: 24 patients in Spain were randomized to receive either medium-dose ivermectin or placebo. The primary outcome was percent of patients with negative PCR at day 7; secondary outcomes were viral load and symptoms. The primary endpoint ended up being kind of a wash - everyone was still PCR positive by day 7 so it was impossible to compare groups. Ivermectin trended toward lower viral load but never reached significance. Weirdly, ivermectin did seem to help symptoms, but only anosmia and cough towards the end (p = 0.03), which you would usually think of as lingering post-COVID problems. The paper says: “Given these findings, consideration could be given to alternative mechanisms of action different from a direct antiviral effect. One alternative explanation might be a positive allosteric modulation of the nicotinic acetylcholine receptor caused by ivermectin and leading to a downregulation of the ACE-2 receptor and viral entry into the cells of the respiratory epithelium and olfactory bulb. Another mechanism through which ivermectin might influence the reversal of anosmia is by inhibiting the activation of pro-inflammatory pathways in the olfactory epithelium. Inflammation of the olfactory mucosa is thought to play a key role in the development of anosmia in SARS-CoV-2 infection.” This seems kind of hedge-y. If you’re wondering where things went from there, Dr. Chaccour is now a passionate anti-ivermectin activist. On November 7, 2021, he tweeted: “@Finneganporter in @BusinessInsider. The roots of #ivermectin mania: How South America incubated a fake-medicine craze that took the US by storm”.
So I guess he must think of this trial as basically negative, although realistically it’s 24 people and we shouldn’t put too much weight on it either way. Ghauri et al: Pakistan, 95 patients. Nonrandom; the study compared patients who happened to be given ivermectin (along with hydroxychloroquine and azithromycin) vs. patients who were just given the latter two drugs. There’s some evidence this produced systematic differences between the two groups - for example, patients in the control group were 3x more likely to have had diarrhea (this makes sense; diarrhea is a potential ivermectin side effect, so you probably wouldn’t give it to people already struggling with this problem). Also, the control group was twice as likely to be getting corticosteroids, maybe a marker for illness severity. Primary outcome was what percent of both groups had a fever: on day 7 it was 21% of ivermectin patients vs. 65% of controls, p < 0.001. No other outcomes were reported.
I don’t hate this study, but I think the nonrandom assignment (and observed systematic differences) is a pretty fatal flaw. I can’t find anyone else talking about this one. At least no one seems to be saying anything bad. Babalola et al: Be warned: if I have to refer to this one in real-life conversation, I will expand out the “et al” and call it “Babalola & Alakoloko”, because that’s really fun to say. This was a Nigerian RCT comparing 21 patients on low-dose ivermectin, 21 patients on high-dose ivermectin, and 20 patients on a combination of lopinavir and ritonavir, a combination antiviral which later studies found not to work for COVID and which might as well be considered a placebo. Primary outcome, as usual, was days until a negative PCR test. High dose ivermectin was 4.65 days, low dose was 6 days, control was 9.15, p = 0.035. Figure 2 is apparently a photograph of the computer screen where they did this calculation. Gideon Meyerowitz-Katz, part of the team that detects fraud in ivermectin papers, is not a fan of this one: He doesn’t say there what he means, but elsewhere he tweets this figure: It’s always a bad sign when your study features in an image with “NUMEROUS IMPOSSIBLE NUMBERS” in red at the top. I think his point is that if you have 21 people, it’s impossible to have 50% of them have headache, because that would be 10.5. If 10 people have a headache, it would be 47.6%; if 11, 52.4%. So something is clearly wrong here. Seems like a relatively minor mistake, and Meyerowitz-Katz stops short of calling fraud, but it’s not a good look. I’m going to be slightly uncomfortable with this study without rejecting it entirely, and move on. Ravakirti et al: Here we’re in Eastern India - not exactly Bangladesh again, but a stone’s throw away from it. In this RCT patients were randomized into an ivermectin group (57) and a placebo group (58). Primary outcome was negative PCR on day 6, because doing it on day 7 like everyone else would be too easy.
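A brief aside on the “50% of 21 people” arithmetic from the headache figure above: the check is simple enough to sketch in a few lines. This is only my illustration of the counting argument, with function names I made up; it is not a reconstruction of whatever tool the fraud-hunters actually used.

```python
def possible_percentages(n, decimals=0):
    """Every percentage that k out of n people can produce, rounded."""
    return {round(100 * k / n, decimals) for k in range(n + 1)}

def is_achievable(reported_pct, n, decimals=0):
    """True if some whole number of people out of n gives the reported percentage."""
    return round(reported_pct, decimals) in possible_percentages(n, decimals)

# With 21 patients, no whole number of headaches rounds to 50%.
print(is_achievable(50, 21))                # False
# The nearest real options are 10/21 and 11/21.
print(is_achievable(47.6, 21, decimals=1))  # True
print(is_achievable(52.4, 21, decimals=1))  # True
# With 20 patients, 50% is fine (10 people).
print(is_achievable(50, 20))                # True
```

A failed check like this only says the reported number cannot come from the stated group size; it does not say whether the cause is a typo, a different denominator, or something worse.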
As with several other groups, this was a bad move; too few people had it to make a good comparison; it was 13% of intervention vs. 18% of placebo, p = 0.3. Secondary outcomes were also pretty boring, except for the most important: 4 people in the placebo group died, compared to 0 in ivermectin (p = 0.045). On the one hand, this is one outcome of many, reaching the barest significance threshold. Another fluke? Still, there are no real problems with this study, and nobody has anything to say against it. Let’s add this one to the scale as another very small and noisy piece of real evidence in ivermectin’s favor. Bukhari et al: Now we’re in Pakistan. 50 patients were randomized to low-dose ivermectin, another 50 got standard of care including vitamin D. There was no placebo, but primary outcome was number of days to reach negative PCR, which it seems hard for placebo to affect much, so I don’t care. 5 controls and 9 ivermectin patients left the hospital against medical advice and could not be followed up, which is bad but not necessarily study-ruining. They never measured their supposed primary outcome of “days to reach negative PCR” directly, but they did measure how many people had negative PCR on various days, and ivermectin had a clear advantage - for example, on day 7, it was 37/50 for IVR and only 20/50 for control. Even if we assume all the lost-to-followup patients had maximally bad-for-the-hypothesis results, that’s still a positive finding. Nobody else has much to say about this one, certainly no accusations that they’ve found anything suspicious. Keep. Mohan et al: India. RCT. 40 patients got low-dose ivermectin, 40 high-dose ivermectin, and 45 placebo. Primary outcomes were time to negative PCR, and viral load on day 5. In the results, they seem to have reinterpreted “time to negative PCR” as the subtly different “percent with negative PCR on some specific day”. 
High-dose ivermectin did best (47.5% negative on day 5) and placebo worst (31% negative), but it was insignificant (p = 0.3). There was no difference in viral load. All groups took about the same amount of time for symptoms to resolve. More placebo patients had failed to recover by the end of the study (6) than ivermectin patients (2), but this didn’t reach statistical significance (p = 0.4). Overall a well-done, boring, negative study, although ivermectin proponents will correctly point out that, like basically every other study we have looked at, the trend was in favor of ivermectin and this could potentially end up looking impressive in a meta-analysis. Biber et al: This is an RCT from Israel. 47 patients got ivermectin and 42 placebo. Primary endpoint was viral load on day 6. I am having trouble finding out what happened with this; as far as I can tell it was a negative result and they buried it in favor of more interesting things. In a "multivariable logistic regression model, the adjusted odds ratio of negative SARS-CoV-2 RT-PCR negative test" favored ivermectin over placebo (p = 0.03 for day 6, p = 0.01 for day 8), but this seems like the kind of thing you do when your primary outcome is boring and you’re angry. Gideon Meyerowitz-Katz is not a fan: He notes that the study excluded people with high viral load, but the preregistration didn’t say they would do that. Looking more closely, he finds they did that because, if you included these people, the study got no positive results. So probably they did the study, found no positive results, re-ran it with various subsets of patients until they did get a positive result, and then claimed to have “excluded” patients who weren’t in the subset that worked. I’m going to toss this one. Elalfy et al: What even is this? Where am I? As best I can tell, this is some kind of Egyptian trial. 
It might or might not be an RCT; it says stuff like “Patients were self-allocated to the treatment groups; the first 3 days of the week for the intervention arm while the other 3 days for symptomatic treatment”. Were they self-allocated in the sense that they got to choose? Doesn’t that mean it’s not random? Aren’t there seven days in a week? These are among the many questions that Elalfy et al do not answer for us. The control group (which they seem to think can also be called “the white group”) took zinc, paracetamol, and maybe azithromycin. The intervention group took zinc, nitazoxanide, ribavirin, and ivermectin. There were very large demographic differences between the groups of the sort which make the study unusable, which they mention and then ignore. From there, they follow this normal and totally comprehensible flowchart: There is no primary outcome assigned, but viral clearance rates on day seven were 58% in the yellow group compared to 0% in the white group, which I guess is a strong positive result. This table… …looks very impressive, in terms of the experimental group doing better than the control, except that they don’t specify whether it was before the trial or after it, and at least one online commentator thinks it might have been before, in which case it’s only impressive how thoroughly they failed to randomize their groups. Overall I don’t feel bad throwing this study out. I hope it one day succeeds in returning to its home planet. Lopez-Medina et al: Colombian RCT. 200 patients took ivermectin, another 200 took placebo. They originally worried the placebo might taste different than real ivermectin, then solved this by replacing it with a different placebo, which is a pretty high level of conscientiousness. Primary outcome was originally percent of patients whose symptoms worsened by two points, as rated on a complicated symptom scale when a researcher asked them over the phone. 
Halfway through the study, they realized nobody was worsening that much, so they changed the primary outcome to time until symptoms got better, as measured by the scale. In the ivermectin group, symptoms improved that much after 10 days; in the placebo group, after 12, p = 0.53. By the end of the study, symptoms had improved in 82% of ivermectin users and 79% of controls, also insignificant. 4 patients in the ivermectin group needed to be hospitalized compared to 6 in the placebo group, again insignificant. This study is bigger than most of the other RCTs, and more polished in terms of how many spelling errors, photographs of computer screens, etc, it contains. It was published in JAMA, one of the most prestigious US medical journals, as opposed to the crappy nth-tier journals most of the others have been in. When people say things like “sure, a lot of small studies show good results for ivermectin, but the bigger and more professional trials don’t”, this is one of the two big professional trials they’re talking about. Ivermectin proponents make some good arguments against it. In order to get as big as it did, Lopez-Medina had to compromise on rigor. Its outcome is how people self-score their symptoms on a hokey scale in a phone interview, instead of viral load or PCR results or anything like that. Still, this is basically what we want, right? In the end, we want people to feel better and less sick, not to get good scores on PCR tests. Also, it changed its primary outcome halfway through; isn’t that bad? I think maybe not; the reason we want a preregistered primary outcome is so that you don’t change halfway through to whatever outcome shows the results you want. The researchers in this study did a good job explaining why they changed their outcome, the change makes sense, and their original outcome would also have shown ivermectin not working (albeit less accurately and effectively). 
I don’t know of any evidence that they knew (or suspected) final results when switching to this new outcome, and it seems like the most reasonable new outcome to switch to. Finally, their original placebo tasted different from ivermectin (though they switched halfway through). This is one of the few studies where I actually care about placebo, because people are self-rating their symptoms. But realistically most of these people don’t know what ivermectin is supposed to taste like. Also, they did a re-analysis and found there was no difference between the people who got the old placebo and the new one. I’m making a big deal of this because ivmmeta.com - the really impressive meta-analysis site I’ve been going off of - puts a special warning letter underneath their discussion of this study, urging us not to trust it. They don’t do this for any of the other ones we’ve addressed so far - not the one by the guy whose other studies were all frauds, not the one where 50% of 21 people had headaches, not the unrandomized one where the groups were completely different before the experiment started, not even the one by the guy accused of crimes against humanity. Only this one. This makes me a lot less charitable to ivmmeta than I would otherwise be; I think it’s hard to choose this particular warning letter strategy out of well-intentioned commitment to truth. They just really don’t like this big study that shows ivermectin doesn’t work. Also, the warning itself irritates me, and includes paragraphs like: RCTs have a fundamental bias against finding an effect for interventions that are widely available — patients that believe they need treatment are more likely to decline participation and take the intervention [Yeh], i.e., RCTs are more likely to enroll low-risk participants that do not need treatment to recover (this does not apply to the typical pharmaceutical trial of a new drug that is otherwise unavailable). 
This trial was run in a community where ivermectin was available OTC and very widely known and used. Nobody else worries about this, and there are a million biases that non-randomized studies have that would be super-relevant when discussing those, but somehow when they’re pro-ivermectin the site forgets to be this thorough. I think a better pro-ivermectin response to this study is to point out that all the trends support ivermectin. Symptoms took 10 days to resolve in the ivermectin group vs. 12 in placebo; 4 ivermectin patients were hospitalized vs. 6 placebo patients, etc. Just say that this was an unusually noisy trial because of the self-report methodology, and you’re confident that these small differences will add up to significance when you put them into a meta-analysis. Roy et al: We’re back in East India, and back to non-randomized trials. 56 patients were retrospectively examined; some had been given ivermectin + doxycycline, others hydroxychloroquine, others azithromycin, and others symptomatic treatment only. We don’t get any meaningful information about how this worked, but we are told that they did not differ in “clinical well-being reporting onset timing”. Whatever. Chahla et al: The first of many Argentine trials. 110 patients received medium-dose ivermectin; 144 were kept as a control (no placebo). This was “cluster randomized”, which means they randomize different health centers to either give the experimental drug or not. This is worse than regular randomization, because there could be differences between these health centers (eg one might have better doctors who otherwise give better treatment, one might be in the poor part of town and have sicker patients, etc). They checked to see if there were any differences between the groups, and it sure looks like there were (the experimental group had twice as many obese people as the controls), but as per them, these differences were not statistically significant.
Note that if this did make a difference, it would presumably make ivermectin look worse, not better. The primary outcome was given as “increase discharge from outpatient care with COVID-19 mild disease”. This favored the treatment; only 2/110 patients in the ivermectin group failed to be discharged, compared to 20 patients in the control group. But, uh, these were at different medical centers. Can’t different medical centers just have different discharge policies? One discharges you as soon as you seem to be getting better, the other waits to really make sure? This is an utterly crap endpoint to do a cluster randomized controlled trial on. If you’re going to do cRCT, which is never a great idea, you should be using some extremely objective endpoint that doctors and clinic administrators can’t possibly affect, like viral load according to some third-party laboratory, using the same third-party laboratory for both clinics. This is such a bad idea that I can’t help worrying I’m missing or misunderstanding something. If not, this is dumb and bad and should be ignored. Mourya et al: We’re back in India. This is a nonrandomized study comparing 50 patients given ivermectin to 50 patients given hydroxychloroquine. No primary outcome was named, but they focus on PCR negativity. Only 6% of patients in the hydroxychloroquine group were negative, compared to 90% of patients in the ivermectin group! On what day did they do the test? Uh, kind of random, and they admit that “in [the hydroxychloroquine group], mean time difference from the date of initiation of treatment and second test was significantly longer (7.24±2.75 days) as compared to 5.22±1.21 days in [the ivermectin group] (p=0.021).” Since they assessed these groups at different times, we shouldn’t draw any conclusions from them getting different results. Except that as far as I can tell this should handicap ivermectin, making it especially impressive that it did better. 
But also, the ivermectin group was made mostly of people who had been asymptomatic at the beginning (70%), and the hydroxychloroquine group had almost no asymptomatic cases (8%). They were giving the ivermectin to healthy people and the hydroxychloroquine to sick people! They admit deep in the discussion that this “may be a confounding factor”. So basically they got totally different groups of people, tested them at totally different times, and the two sets of test results differed. So what? So this is why normal people do RCTs instead of whatever the heck this is, that’s what. Loue et al: …this one isn’t going to be an RCT either. Loue tells a story about a cluster of COVID cases at the French nursing home where he works. He asked people if they wanted to try ivermectin; 10 did and 15 didn’t. 1 ivermectin patient died, compared to 5 non-ivermectin patients. The non-ivermectin group looked a bit sicker than the ivermectin group in the inevitable Table 1, though it’s hard to tell. One interesting possible confounder (not mentioned, but I’m imagining it) is that demented patients probably couldn’t consent to ivermectin and ended up in the control group. This is another case of “I’m not going to trust anything that isn’t an RCT”. Merino et al: Another (sigh) non-RCT. Mexico City tried a public health program where if you called a hotline and said you had COVID, they sent you an emergency kit with various useful supplies. One of those supplies was ivermectin tablets. 18,074 people got the kit (and presumably some appreciable fraction took the ivermectin, though there’s no way to prove that). Their control group is people from before they started giving out the kits, people from after they stopped giving out the kits, and people who didn’t want the kits. There are differences in who got COVID early in the epidemic vs. later, and in people who did opt for medical kits vs. didn’t.
To correct these, the researchers tried to adjust for confounders, something which - as I keep trying to hammer home again and again - never works. They found that using the kit led to a 75% or so reduction in hospitalization, though they were unable to separate out the ivermectin from the other things in the kit (paracetamol and aspirin), or from the placebo effect of having a kit and feeling like you had already gotten some treatment (if I understand right, the decision to go to the hospital was left entirely to the patient). I think this study is a moderate point in favor of giving people kits in order to prevent hospital overcrowding, but I’m not willing to accept that it tells us much about ivermectin in particular. Faisal et al: This one was published in The Professional Medical Journal (misspelled as “Profesional Medical Journal” in its URL), so you know it’s going to be good! It describes itself as “a cross-sectional study”, but later says it “randomized patients into two groups”, which would make it an RCT - I think they might just be using the term “cross-sectional” differently from the standard American usage. A hospital in Pakistan got 50 patients on ivermectin + azithromycin, and another 50 on azithromycin alone. Primary outcome was not mentioned, and the data were presented confusingly, but a typical result is that only 4% of the ivermectin group had symptoms lasting more than 10 days, whereas 16% of the control group did, p < 0.01. They do a really weird thing where they compare how long it took symptoms to resolve between IVM and control groups within each bin. That is, if I’m understanding correctly, they ask “of the people who took between 3-5 days for symptoms to resolve, did they resolve faster for IVM or control?”. This is an utterly bizarre analysis to perform, although it doesn’t affect the fact that their other results still seem to favor ivermectin. Maybe I’m confused about what’s going on here. 
I’ve mostly been letting people off easy on no placebo, but as far as I can tell (not very far) this paper seems to be going off whether patients reported continuing to have symptoms to the hospital doing the study, and I think that is potentially susceptible to placebo effects. Additionally, there’s no preregistration, and even though they talk a lot about doing PCR tests they don’t present the results. This is by no means the worst study here but I still think it’s pretty low quality and I don’t trust it. Aref et al: This one is published in the International Journal Of Nanomedicine, even though I’m pretty sure that isn’t a real thing. In this case the “nanomedicine” is a new nasal spray version of ivermectin which is so confusing I cannot for the life of me figure out what dose they are giving these patients. This Egyptian study gives 57 patients intranasal ivermectin plus hydroxychloroquine, azithromycin, oseltamivir, and some vitamins; another 57 patients get all that stuff except the ivermectin. Primary outcome is not stated, but they look at various symptoms, all of which look better in the ivermectin group: 95% of ivermectin patients got negative PCRs at some time point, compared to 75% of controls, p = 0.004. I am pretty suspicious of this study, not least because it comes from Egypt which has an awful reputation for fake studies, and it returns extreme results that I wouldn’t expect even if ivermectin was actually a wonder drug. But I cannot find any particular thing wrong with it, nor did anyone else I looked at, so I will grudgingly let it stand. Krolewiecki et al: Another Argentine study. This one is a real RCT. 30 patients received ivermectin, 15 were the control group (no placebo, again). Primary outcome was difference in viral load on day 5. The trend favored ivermectin but it was not statistically significant, although they were able to make it statistically significant if they looked at a subset of higher-IVM-plasma-concentration patients. 
They did not find any difference in clinical outcomes. A pro-ivermectin person could point out that in the subgroup with the highest ivermectin concentrations, the drug seemed to work. A skeptic could point out that this is exactly the kind of subgroup slicing that you are not supposed to do without pre-registering it, which I don’t think this team did. I agree with the skeptic. Vallejos et al: Another Argentine study. It’s big (250 people in each arm). It’s an RCT. It tries to define a primary outcome (“Primary outcome: the trial ended when the last patient who was included achieved the end of study visit”), but that’s not what “primary outcome” means, and they don’t offer an alternative. Other outcomes: no difference in PCR on days 3 or 12. Hospitalization is nonsignificantly better in the ivermectin group (14 vs. 21, p = 0.2), but death is nonsignificantly better in the placebo group (3 vs. 4, p = 0.7). This isn’t even the kind of nonsignificant that might contribute to an exciting meta-analysis later. This is just a pure null result. I cannot find any problem with this study, and neither can anyone else I checked. This is the biggest RCT we’ve seen so far, so we should take it seriously. TOGETHER Trial: Speaking of big RCTs… This one hasn’t been published yet. There’s a video of a talk about it, but I am not going to watch it, because it is a video, so I am getting information secondhand from eg here. Apparently, it compares 677 people (!) randomized to ivermectin to 678 people randomized to placebo. 86 ivermectin patients ended up in the hospital compared to 95 placebo patients, p-value not significant. This was a really big professional trial done by bigshot researchers from a major Canadian university, and the medical establishment is taking it much more seriously than any of these others. When it comes out, it will probably get published in a top journal. 
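For readers who want to sanity-check hospitalization counts like the ones quoted for Vallejos and TOGETHER, here is a minimal sketch of a pooled two-proportion z-test. It is not necessarily the test either trial used. The function name is made up for illustration, and the Vallejos arm sizes are assumed to be exactly 250 each, as described above.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided pooled z-test for a difference between two proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled event rate under the null hypothesis of no difference.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Vallejos hospitalizations: 14 vs. 21, assuming exactly 250 patients per arm.
z_vallejos, p_vallejos = two_proportion_z_test(14, 250, 21, 250)

# TOGETHER hospitalizations as reported secondhand: 86/677 vs. 95/678.
z_together, p_together = two_proportion_z_test(86, 677, 95, 678)

print(f"Vallejos: z = {z_vallejos:.2f}, p = {p_vallejos:.2f}")
print(f"TOGETHER: z = {z_together:.2f}, p = {p_together:.2f}")
```

Under these assumptions the Vallejos comparison lands close to the quoted p = 0.2, and the TOGETHER comparison is nowhere near conventional significance, which matches the descriptions above.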
When discussing Lopez-Medina, I wrote: When people say things like “sure, a lot of small studies show good results for ivermectin, but the bigger and more professional trials don’t”, this is one of the two big professional trials they’re talking about. This is the other one. Not coincidentally, it’s also the other trial that ivmmeta.com has a warning letter underneath telling you to disregard. Their main concern is that instead of truly randomizing patients to ivermectin vs. placebo, they did a time-dependent randomization that meant during some weeks more patients were getting one or the other. This is a problem because the trial takes place in Brazil, where different variants were more common at different times. Here’s their image: On the one hand, I have immense contempt for ivmmeta for letting all those other awful studies pass and then pulling out all the stops to try to nitpick this one. I have no idea if their proposed randomization failure really happened. And no doubt the reason they’re even able to investigate this is that this study is really careful and transparent - most of them don’t tell you anything about their randomization method. I would be shocked if other studies don’t have all these problems and worse. On the other hand, the point isn’t to be fair, it’s to be right. And this is a potential confounder. Not a huge one. But a potential one. I guess all we can do is try to bound the damage. Even if the confounding is 100% real and bad, there’s no way to make this study consistent with the crazy super-pro-ivermectin results of studies like Espitia-Hernandez and Aref. And even if we deny any confounding, we see the same slight pro-ivermectin trend - 86 hospitalizations vs. 95 - that we’ve seen in so many other studies. 
Nothing is going to make me believe that this isn’t in the top 33% of studies we’ve been looking at, so let’s add it as grist for the meta-analysis (though maybe not quite as much grist as its vast size indicates) and move on, angrily. Buonfrate et al: An Italian RCT. Patients were randomized into low-dose ivermectin (32), placebo (29), or high-dose ivermectin (32). Primary outcome was viral load on day 7. There was no significant difference (average of 2 in ivermectin groups, 2.2 in placebo group). They admit that they failed to reach the planned sample size, but did a calculation to show that even if they had, the trial could not have returned a positive result. Clinically, an average of 2 patients were hospitalized in each of the ivermectin arms, compared to 0 in the placebo arm - which bucks our previously-very-constant pro-ivermectin trend. Mayer et al: Not an RCT. Patients in an Argentine province were offered the opportunity to try ivermectin; 3266 said yes and became the experimental group, 17966 said no and became the control group. There were many obvious differences between the groups, but they all seemed to handicap ivermectin. There was a nonsignificant trend toward less hospitalization and significantly less mortality (1.5% vs. 2.1%, p = 0.03). While looking into this study, I learned the term “immortal time bias”. This means a period in between selection for the study and the beginning of study recording where patient outcomes are not counted. I think the problem here is that if you signed up for the system on Day X, and if you got sick before they could give you ivermectin, you were in the control group. See this Twitter thread, I have not confirmed everything he says. This only hardens my resolve to stay away from non-RCTs. Borody et al: Our last paper! …is it a paper? I can’t find it published anywhere. It mostly seems to be on news sites. Doesn’t look peer-reviewed. 
And it starts with “Note that views expressed in this opinion article are the writer’s personal views”. Whatever. 600 Australians were treated with ivermectin, doxycycline, and zinc. The article compares this to an “equivalent control group” made of “contemporary infected subjects in Australia obtained from published Covid Tracking Data”; this is not how you control group, @#!% you. Then it gets excited about the fact that most patients had better symptoms at the end of the ten-day study period than the beginning (untreated COVID resolves in about ten days). Why are these people wasting my time with this? Let’s move on. The Analysis If we remove all fraudulent and methodologically unsound studies from the table above, we end up with this: Gideon Meyerowitz-Katz, who investigated many of the studies above for fraud, tried a similar exercise. I learned about his halfway through, couldn’t help seeing it briefly, but tried to avoid remembering it or using it when generating mine (also, I did take the result of his fraud investigations into account), so they should be considered not quite independent efforts. His looks like this: He nixed Chowdhury, Babalola, Ghauri, Faisal, and Aref, but kept Szenta Fonseca, Biber (?), and Mayer. There was a correlation of 0.45, which I guess is okay. I asked him about his decision-making, and he listed a combination of serious statistical errors and small red flags adding up. I was pretty uncomfortable with most of these studies myself, so I will err on the side of severity, and remove all studies that either I or Meyerowitz-Katz disliked. We end up with the following short list: We’ve gone from 29 studies to 11, getting rid of 18 along the way. For the record, we eliminated 2/19 for fraud, 1/19 for severe preregistration violations, 10 for methodological problems, and 6 because Meyerowitz-Katz was suspicious of them. …but honestly this table still looks pretty good for ivermectin, doesn’t it? Still lots of big green boxes. 
Meyerowitz-Katz accuses ivmmeta of cherry-picking what statistic to use for their forest plot. That is, if a study measures ten outcomes, they sometimes take the most pro-ivermectin outcome. Ivmmeta.com counters that they used a consistent and reasonable (if complicated) process for choosing their outcome of focus, that being: If studies report multiple kinds of effects then the most serious outcome is used in calculations for that study. For example, if effects for mortality and cases are both reported, the effect for mortality is used, this may be different to the effect that a study focused on. If symptomatic results are reported at multiple times, we used the latest time, for example if mortality results are provided at 14 days and 28 days, the results at 28 days are used. Mortality alone is preferred over combined outcomes. Outcomes with zero events in both arms were not used (the next most serious outcome is used — no studies were excluded). For example, in low-risk populations with no mortality, a reduction in mortality with treatment is not possible, however a reduction in hospitalization, for example, is still valuable. Clinical outcome is considered more important than PCR testing status. When basically all patients recover in both treatment and control groups, preference for viral clearance and recovery is given to results mid-recovery where available (after most or all patients have recovered there is no room for an effective treatment to do better). If only individual symptom data is available, the most serious symptom has priority, for example difficulty breathing or low SpO2 is more important than cough. I’m having trouble judging this, partly because Meyerowitz-Katz says ivmmeta has corrected some earlier mistakes, and partly because there really is some reasonable debate over how to judge studies with lots of complicated endpoints. 
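To make the quoted selection rules easier to reason about, here is a rough sketch of how the main ones could be encoded. This is not ivmmeta's actual code. The `Outcome` class, the category names, and the numeric severity ranks are all hypothetical, and the mid-recovery and most-serious-symptom rules are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical severity ranking (higher = more serious). The ordering follows
# the quoted rules; the category names and numbers are invented for this sketch.
SEVERITY = {
    "mortality": 5,
    "combined_mortality": 4,  # e.g. death or ventilation as one endpoint
    "hospitalization": 3,
    "symptoms": 2,
    "pcr": 1,
}


@dataclass
class Outcome:
    kind: str              # one of the SEVERITY keys
    day: int               # day on which the outcome was measured
    events_treatment: int  # events in the ivermectin arm
    events_control: int    # events in the control arm


def pick_outcome(outcomes: List[Outcome]) -> Optional[Outcome]:
    """Pick one outcome per study: most serious first, then latest time point."""
    # Rule: outcomes with zero events in both arms are not used.
    usable = [o for o in outcomes if o.events_treatment + o.events_control > 0]
    if not usable:
        return None
    # Rule: most serious outcome wins; ties go to the latest measurement.
    return max(usable, key=lambda o: (SEVERITY[o.kind], o.day))


# A made-up study with no deaths in either arm.
example_study = [
    Outcome("mortality", 28, 0, 0),
    Outcome("hospitalization", 14, 3, 7),
    Outcome("hospitalization", 28, 4, 9),
    Outcome("pcr", 7, 10, 20),
]
chosen = pick_outcome(example_study)
print(chosen)
```

In the made-up study, mortality is skipped because nobody died in either arm, so the procedure falls back to hospitalization at the latest time point. Writing the rules down this way also shows how much room for judgment remains in deciding which category each reported endpoint belongs to.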
By this point I had completely forgotten what ivmmeta did, so I independently coded all 11 remaining studies following something in between my best understanding of their procedure and what I considered common sense. The only exception was that when the most severe outcome was measured in something other than patients (ie average number of virus copies per patient), I defaulted to one that was measured in patients instead, to keep everything with the same denominator. My results mostly matched ivmmeta’s, with one or two exceptions that I think are within the scope of argument or related to my minor deviations from their protocol. Placebo vs. ivermectin groups sometimes differed in size, which I’ve adjusted for and rounded off. Probably I’m forgetting some reason I can’t just do simple summary statistics to this, but whatever. It is p = 0.15, not significant. This is maybe unfair, because there aren’t a lot of deaths in the sample, so by focusing on death rather than more common outcomes we’re pointlessly throwing away sample size. What happens if I unprincipledly pick whatever I think the most reasonable outcome to use from each study is? I’ve chosen “most reasonable” as a balance between “is the most severe” and “has a lot of data points”: Now it’s p = 0.04, seemingly significant, but I had to make some unprincipled decisions to get there. I don’t think I specifically replaced negative findings with positive ones, but I can’t prove that even to myself, let alone to you. [UPDATE 5/31/22: A reader writes in to tell me that the t-test I used above is overly simplistic. A Dersimonian-Laird test is more appropriate for meta-analysis, and would have given 0.03 and 0.005 on the first and second analysis, where I got 0.15 and 0.04. This significantly strengthens the apparent benefit of ivermectin from ‘debatable’ to ‘clear’. I discuss some reasons below why I am not convinced by this apparent benefit.] 
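For reference, here is a minimal sketch of the DerSimonian-Laird random-effects pooling mentioned in the update, applied to log risk ratios. The study counts are hypothetical placeholders, not the 11 studies analyzed here, and the 0.5 continuity correction for zero cells is an assumption of this sketch.

```python
import math
from statistics import NormalDist


def log_risk_ratio(e_t, n_t, e_c, n_c):
    """Log risk ratio and its variance; adds 0.5 to each cell if any arm has zero events."""
    if e_t == 0 or e_c == 0:
        e_t, e_c, n_t, n_c = e_t + 0.5, e_c + 0.5, n_t + 1, n_c + 1
    y = math.log((e_t / n_t) / (e_c / n_c))
    v = 1 / e_t - 1 / n_t + 1 / e_c - 1 / n_c
    return y, v


def dersimonian_laird(studies):
    """Random-effects pooled risk ratio from (events_t, n_t, events_c, n_c) tuples."""
    if len(studies) < 2:
        raise ValueError("need at least two studies")
    effects = [log_risk_ratio(*s) for s in studies]
    ys = [y for y, _ in effects]
    vs = [v for _, v in effects]
    w = [1 / v for v in vs]                       # fixed-effect weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(ys) - 1)) / c)      # between-study variance
    w_star = [1 / (v + tau2) for v in vs]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, ys)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    z = pooled / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"pooled_rr": math.exp(pooled), "tau2": tau2, "z": z, "p": p}


# Hypothetical studies: (events_treatment, n_treatment, events_control, n_control).
hypothetical_studies = [(2, 100, 5, 100), (1, 50, 3, 50), (10, 300, 14, 300), (0, 40, 2, 40)]
result = dersimonian_laird(hypothetical_studies)
print(result)
```

Unlike a t-test on pooled raw counts, this weights each study by the precision of its own estimate and allows the true effect to vary between studies, which is one reason the two approaches can disagree about significance.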
(how come I’m finding a bunch of things on the edge of significance, but the original ivmmeta site found a lot of extremely significant things? Because they combined ratios, such that “one death in placebo, zero in ivermectin” looked like a nigh-infinite benefit for ivermectin, whereas I’m combining raw numbers. Possibly my way is statistically illegitimate for some reason, but I’m just trying to get a rough estimate of how convinced to be) So we are stuck somewhere between “nonsignificant trend in favor” and “maybe-significant trend in favor, after throwing out some best practices”. This is normally where I would compare my results to those of other meta-analyses made by real professionals. But when I look at them, they all include studies later found to be fake, like Elgazzar, and unsurprisingly come up with wildly positive conclusions. There are about six in this category. One of them later revised their results to exclude Elgazzar and still found strong efficacy for ivermectin, but they still included Niaee and some other dubious studies. The only meta-analysis that doesn’t make these mistakes is Popp (a Cochrane review), which is from before Elgazzar was found to be fraudulent, but coincidentally excludes it for other reasons. It also excludes a lot of good studies like Mahmud and Ravakirti because they give patients other things like HCQ and azithromycin - I chose to include them, because I don’t think they either work or have especially bad side effects, so they’re basically placebo - but Cochrane is always harsh like this. They end up with a point estimate where ivermectin cuts mortality by 40% - but say the confidence intervals are too wide to draw any conclusion. I think this basically agrees with my analyses above - the trends really are in ivermectin’s favor, but once you eliminate all the questionable studies there are too few studies left to have enough statistical power to reach significance. 
Except that everyone is still focusing on deaths and hospitalizations just because they’re flashy. Mahmud et al, which everyone agrees is a great study, found that ivermectin decreased days until clinical recovery, p = 0.003? So what do you do? This is one of the toughest questions in medicine. It comes up again and again. You have some drug. You read some studies. Again and again, more people are surviving (or avoiding complications) when they get the drug. It’s a pattern strong enough to common-sensically notice. But there isn’t an undeniable, unbreachable fortress of evidence. The drug is really safe and doesn’t have a lot of side effects. So do you give it to your patients? Do you take it yourself? Here this question is especially tough, because, uh, if you say anything in favor of ivermectin you will be cast out of civilization and thrown into the circle of social hell reserved for Klan members and 1/6 insurrectionists. All the health officials in the world will shout “horse dewormer!” at you and compare you to Josef Mengele. But good doctors aren’t supposed to care about such things. Your only goal is to save your patient. Nothing else matters. I am telling you that Mahmud et al is a good study and it got p = 0.003 in favor of ivermectin. You can take the blue pill, and stay a decent respectable member of society. Or you can take the horse dewormer pill, and see where you end up. In a second, I’ll tell you my answer. But you won’t always have me to answer questions like this, and it might be morally edifying to observe your thought process in situations like this. So take a second, and meet me on the other side of the next section heading. … … … … … The Synthesis Hopefully you learned something interesting about yourself there. But my answer is: worms! As several doctors and researchers have pointed out (h/t especially Avi Bitterman and David Boulware), the most impressive studies come from places that are teeming with worms. 
Mahmud from Bangladesh, Ravakirti from East India, Lopez-Medina from Colombia, etc. Here’s the prevalence of roundworm infections by country (source). But alongside roundworms, there are threadworms, hookworms, blood flukes, liver flukes, nematodes, trematodes, all sorts of worms. Add them all up and somewhere between a quarter and a half of people in the developing world have at least one parasitic worm in their body. Being full of worms may impact your ability to fight coronavirus. Gluchowska et al write: Helminth [ie worm] infections are among the most common infectious diseases. Bradbury et al. highlight the possible negative interactions between helminth infection and COVID-19 severity in helminth-endemic regions and note that alterations in the gut microbiome associated with helminth infection appear to have systemic immunomodulatory effects. It has also been proposed that helminth co-infection may increase the morbidity and mortality of COVID-19, because the immune system cannot efficiently respond to the virus; in addition, vaccines will be less effective for these patients, but treatment and prevention of helminth infections might reduce the negative effect of COVID-19. During millennia of parasite-host coevolution helminths evolved mechanisms suppressing the host immune responses, which may mitigate vaccine efficacy and increase severity of other infectious diseases. Treatment of worm infections might reduce the negative effect of COVID-19! And ivermectin is a deworming drug! You can see where this is going… The most relevant species of worm here is the roundworm Strongyloides stercoralis. Among the commonest treatments for COVID-19 are corticosteroids, a type of immunosuppressant drug. The types of immune responses they suppress do more harm than good in coronavirus, so turning them off limits collateral damage and makes patients better on net. But these are also the types of immune responses that control Strongyloides. 
If you turn them off even very briefly, the worms multiply out of control, you get what’s called “Strongyloides hyperinfection”, and pretty often you die. According to the WHO: The current COVID-19 pandemic serves to highlight the risk of using systemic corticosteroids and, to a lesser extent, other immunosuppressive therapy, in populations with significant risk of underlying strongyloidiasis. Cases of strongyloidiasis hyperinfection in the setting of corticosteroid use as COVID-19 therapy have been described and draw attention to the necessity of addressing the risk of iatrogenic strongyloidiasis hyperinfection syndrome in infected individuals prior to corticosteroid administration. Although this has gained importance in the midst of a pandemic where corticosteroids are one of few therapies shown to improve mortality, its relevance is much broader given that corticosteroids and other immunosuppressive therapies have become increasingly common in treatment of chronic diseases (e.g. asthma or certain rheumatologic conditions). So you need to “address the risk” of strongyloides infection during COVID treatment in roundworm-endemic areas. And how might you address this, WHO? Treatment of chronic strongyloidiasis with ivermectin 200 µg/kg per day orally x 1-2 days is considered safe with potential contraindications including possible Loa loa infection (endemic in West and Central Africa), pregnancy, and weight <15kg. Given ivermectin’s safety profile, the United States has utilized presumptive treatment with ivermectin for strongyloidiasis in refugees resettling from endemic areas, and both Canada and the European Centre for Disease Prevention and Control have issued guidance on presumptive treatment to avoid hyperinfection in at risk populations. Screening and treatment, or where not available, addition of ivermectin to mass drug administration programs should be studied and considered. 
This is serious and common enough that, if you’re not going to screen for it, it might be worth “add[ing] ivermectin to mass drug administration programs” in affected areas! Dr. Avi Bitterman carries the hypothesis to the finish line: First two images are with all relevant studies; second two are a sensitivity analysis that removes some of the most dubious. The good ivermectin trials in areas with low Strongyloides prevalence, like Vallejos in Argentina, are mostly negative. The good ivermectin trials in areas with high Strongyloides prevalence, like Mahmud in Bangladesh, are mostly positive. Worms can’t explain the viral positivity outcomes (ie PCR), but Dr. Bitterman suggests that once you remove low quality trials and worm-related results, the rest looks like simple publication bias: This is still just a possibility. Maybe I’m over-focusing too hard on a couple positive results and this will all turn out to be nothing. Or who knows, maybe ivermectin does work against COVID a little - although it would have to be very little, fading to not at all in temperate worm-free countries. But this theory feels right to me. It feels right to me because it’s the most troll-ish possible solution. Everybody was wrong! The people who called it a miracle drug against COVID were wrong. The people who dismissed all the studies because they F@#king Love Science were wrong. Ivmmeta.com was wrong. Gideon Meyerowitz-Katz was…well, he was right, actually, I got the worm-related meta-analysis graphic above from his Twitter timeline. Still, an excellent troll. Also, the best part is that I ignorantly asked, in my description of Mahmud et al above: And it was! It was a fluke! A literal, physical, fluke! For my whole life, God has been placing terrible puns in my path to irritate me, and this would be the worst one ever! So it has to be true! The Scientific Takeaway About ten years ago, when the replication crisis started, we learned a certain set of tools for examining studies. 
Check for selection bias. Distrust “adjusting for confounders”. Check for p-hacking and forking paths. Make teams preregister their analyses. Do forest plots to find publication bias. Stop accepting p-values of 0.049. Wait for replications. Trust reviews and meta-analyses, instead of individual small studies. These were good tools. Having them was infinitely better than not having them. But even in 2014, I was writing about how many bad studies seemed to slip through the cracks even when we pushed this toolbox to its limits. We needed new tools. I think the methods that Meyerowitz-Katz, Sheldrake, Heathers, Brown, Lawrence and others brought to the limelight this year are some of the new tools we were waiting for. Part of this new toolset is to check for fraud. About 10 - 15% of the seemingly-good studies on ivermectin ended up extremely suspicious for fraud. Elgazzar, Carvallo, Niaee, Cadegiani, Samaha. There are ways to check for this even when you don’t have the raw data. Like: The Carlisle-Stouffer-Fisher method: Check some large group of comparisons, usually the Table 1 of an RCT where they compare the demographic characteristics of the control and experimental groups, for reasonable p-values. Real data will have p-values all over the map; one in every ten comparisons will have a p-value of 0.1 or less. Fakers seem bad at this and usually give everything a nice safe p-value like 0.8 or 0.9.
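A rough sketch of the idea behind that check, not Carlisle's exact procedure: if baseline comparisons come from genuine randomization, their p-values should be roughly uniform, so combining them Stouffer-style should not give an extreme result. The p-values below are invented for illustration, and the sketch assumes the comparisons are independent.

```python
import math
from statistics import NormalDist


def stouffer_upper_tail(p_values):
    """Combine baseline p-values and ask how likely they are to be this high overall.

    Assumes independent comparisons with 0 < p < 1. A tiny tail probability
    suggests the groups are more similar than chance would produce.
    """
    nd = NormalDist()
    z_scores = [nd.inv_cdf(p) for p in p_values]  # high p-values map to positive z
    combined_z = sum(z_scores) / math.sqrt(len(z_scores))
    return combined_z, 1 - nd.cdf(combined_z)


def fraction_at_or_below(p_values, threshold=0.1):
    """Share of comparisons with p <= threshold; about 10% is expected for real data."""
    return sum(p <= threshold for p in p_values) / len(p_values)


# Invented Table 1 p-values: one set spread out, one set suspiciously "safe".
plausible = [0.03, 0.45, 0.72, 0.18, 0.91, 0.56, 0.08, 0.33, 0.64, 0.27]
too_tidy = [0.80, 0.90, 0.85, 0.82, 0.88, 0.91, 0.79, 0.86, 0.90, 0.84]

z_ok, tail_ok = stouffer_upper_tail(plausible)
z_bad, tail_bad = stouffer_upper_tail(too_tidy)
print(f"plausible: z = {z_ok:.2f}, tail = {tail_ok:.3f}")
print(f"too tidy:  z = {z_bad:.2f}, tail = {tail_bad:.5f}")
```

The spread-out set gives an unremarkable combined score, while the set where everything sits around 0.8 to 0.9 would be very unlikely under honest randomization. A real application would also need to handle correlated baseline variables and rounding in reported values.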
Here’s the prevalence of roundworm infections by country (source). But alongside roundworms, there are threadworms, hookworms, blood flukes, liver flukes, nematodes, trematodes, all sorts of worms. Add them all up and somewhere between half and a quarter of people in the developing world have at least one parasitic worm in their body. Being full of worms may impact your ability to fight coronavirus. Gluchowska et al write: Helminth [ie worm] infections are among the most common infectious diseases. Bradbury et al. highlight the possible negative interactions between helminth infection and COVID-19 severity in helminth-endemic regions and note that alterations in the gut microbiome associated with helminth infection appear to have systemic immunomodulatory effects. It has also been proposed that helminth co-infection may increase the morbidity and mortality of COVID-19, because the immune system cannot efficiently respond to the virus; in addition, vaccines will be less effective for these patients, but treatment and prevention of helminth infections might reduce the negative effect of COVID-19. During millennia of parasite-host coevolution helminths evolved mechanisms suppressing the host immune responses, which may mitigate vaccine efficacy and increase severity of other infectious diseases. Treatment of worm infections might reduce the negative effect of COVID-19! And ivermectin is a deworming drug! You can see where this is going… The most relevant species of worm here is the roundworm Strongyloides stercoralis. Among the commonest treatments for COVID-19 is corticosteroids, a type of immunosuppresant drug. The types of immune responses it suppresses do more harm than good in coronavirus, so turning them off limits collateral damage and makes patients better on net. But these are also the types of immune responses that control Strongyloides. 
If you turn them off even very briefly, the worms multiply out of control, you get what’s called “Strongyloides hyperinfection”, and pretty often you die. According to the WHO: The current COVID-19 pandemic serves to highlight the risk of using systemic corticosteroids and, to a lesser extent, other immunosuppressive therapy, in populations with significant risk of underlying strongyloidiasis. Cases of strongyloidiasis hyperinfection in the setting of corticosteroid use as COVID-19 therapy have been described and draw attention to the necessity of addressing the risk of iatrogenic strongyloidiasis hyperinfection syndrome in infected individuals prior to corticosteroid administration. Although this has gained importance in the midst of a pandemic where corticosteroids are one of few therapies shown to improve mortality, its relevance is much broader given that corticosteroids and other immunosuppressive therapies have become increasingly common in treatment of chronic diseases (e.g. asthma or certain rheumatologic conditions). So you need to “address the risk” of strongyloides infection during COVID treatment in roundworm-endemic areas. And how might you address this, WHO? Treatment of chronic strongyloidiasis with ivermectin 200 µg/kg per day orally x 1-2 days is considered safe with potential contraindications including possible Loa loa infection (endemic in West and Central Africa), pregnancy, and weight <15kg. Given ivermectin’s safety profile, the United States has utilized presumptive treatment with ivermectin for strongyloidiasis in refugees resettling from endemic areas, and both Canada and the European Centre for Disease Prevention and Control have issued guidance on presumptive treatment to avoid hyperinfection in at risk populations. Screening and treatment, or where not available, addition of ivermectin to mass drug administration programs should be studied and considered. 
This is serious and common enough that, if you’re not going to screen for it, it might be worth “add[ing] ivermectin to mass drug administration programs” in affected areas! Dr. Avi Bitterman carries the hypothesis to the finish line: First two images are with all relevant studies; second two are a sensitivity analysis that removes some of the most dubious. The good ivermectin trials in areas with low Strongyloides prevalence, like Vallejos in Argentina, are mostly negative. The good ivermectin trials in areas with high Strongyloides prevalence, like Mahmud in Bangladesh, are mostly positive. Worms can’t explain the viral positivity outcomes (ie PCR), but Dr. Bitterman suggests that once you remove low quality trials and worm-related results, the rest looks like simple publication bias: This is still just a possibility. Maybe I’m over-focusing too hard on a couple positive results and this will all turn out to be nothing. Or who knows, maybe ivermectin does work against COVID a little - although it would have to be very little, fading to not at all in temperate worm-free countries. But this theory feels right to me. It feels right to me because it’s the most troll-ish possible solution. Everybody was wrong! The people who called it a miracle drug against COVID were wrong. The people who dismissed all the studies because they F@#king Love Science were wrong. Ivmmeta.com was wrong. Gideon Meyerowitz-Katz was…well, he was right, actually, I got the worm-related meta-analysis graphic above from his Twitter timeline. Still, an excellent troll. Also, the best part is that I ignorantly asked, in my description of Mahmud et al above: And it was! It was a fluke! A literal, physical, fluke! For my whole life, God has been placing terrible puns in my path to irritate me, and this would be the worst one ever! So it has to be true! The Scientific Takeaway About ten years ago, when the replication crisis started, we learned a certain set of tools for examining studies. 
Check for selection bias. Distrust “adjusting for confounders”. Check for p-hacking and forking paths. Make teams preregister their analyses. Do forest plots to find publication bias. Stop accepting p-values of 0.049. Wait for replications. Trust reviews and meta-analyses, instead of individual small studies. These were good tools. Having them was infinitely better than not having them. But even in 2014, I was writing about how many bad studies seemed to slip through the cracks even when we pushed this toolbox to its limits. We needed new tools. I think the methods that Meyerowitz-Katz, Sheldrake, Heathers, Brown, Lawrence and others brought to the limelight this year are some of the new tools we were waiting for. Part of this new toolset is to check for fraud. About 10 - 15% of the seemingly-good studies on ivermectin ended up extremely suspicious for fraud. Elgazzar, Carvallo, Niaee, Cadegiani, Samaha. There are ways to check for this even when you don’t have the raw data. Like: The Carlisle-Stouffer-Fisher method: Check some large group of comparisons, usually the Table 1 of an RCT where they compare the demographic characteristics of the control and experimental groups, for reasonable p-values. Real data will have p-values all over the map; one in every ten comparisons will have a p-value of 0.1 or less. Fakers seem bad at this and usually give everything a nice safe p-value like 0.8 or 0.9.
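One ingredient of this kind of check can be sketched with Fisher's method for combining p-values. This is a simplified illustration, not Carlisle's full procedure: the function names and the example p-values are hypothetical, and it assumes the Table 1 comparisons are independent and that every p-value lies strictly between 0 and 1.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom, whose survival function has a closed
    form when the degrees of freedom are even.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

def suspiciously_balanced(p_values):
    """Small result = baseline p-values cluster too close to 1.

    Combining (1 - p) instead of p turns 'groups look too similar'
    into the tail that Fisher's method is sensitive to.
    """
    return fisher_combined_p([1.0 - p for p in p_values])

# Hypothetical Table 1 p-values, invented for illustration only.
plausible = [0.03, 0.22, 0.41, 0.57, 0.68, 0.79, 0.91, 0.35]
too_neat = [0.80, 0.90, 0.85, 0.95, 0.90, 0.88, 0.92, 0.81]

print(suspiciously_balanced(plausible))
print(suspiciously_balanced(too_neat))
```

A small value for the second list would flag the table for a closer look. It would not prove fraud, since correlated baseline variables or rounding can also distort the distribution.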
Source. Real data would follow something like a bell curve. This is going to require a social norm of always sharing data. Even better, journals should require the raw data before they publish anything, and should make it available on their website. People are going to fight hard against this, partly because it’s annoying and partly because of (imho exaggerated) patient privacy related concerns. Somebody’s going to try to make some kind of gated thing where you have to prove you have a PhD and a “legitimate cause” before you can access the data, and that person should be fought tooth and nail (some of the “data detectives” who figured out the ivermectin study didn’t have advanced degrees). I want a world where “I did a study, but I can’t show you the data” is taken as seriously as “I determined P = NP, but I can’t show you the proof.” The second reason I think this, aside from checking for fraud, is checking for mistakes. I have no proof this was involved in ivermectin in particular. But I’ve been surprised how often it comes up when I talk to scientists. Someone in their field got a shocking result, everyone looked over the study really hard and couldn’t find any methodological problems, there’s no evidence of fraud, so do you accept it? A lot of times instead I hear people say “I assume they made a coding error”. I believe them, because I have made a bunch of stupid errors. Sometimes you make the errors for me - an early draft of this post of mine stated that there was a strong positive effect of assortative mating on autism, but when I double-checked it was entirely due to some idiot who filled out the survey and claimed to have 99999 autistic children. In this very essay, I almost said that a set of ivermectin studies showed a positive result because I was reading the number for whether two lists were correlated rather than whether a paired-samples t-test on the lists was significant. I think lots of studies make these kinds of errors. 
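The correlation-versus-paired-test mix-up is easy to reproduce. The sketch below uses invented numbers and hypothetical function names. It shows two lists that are strongly correlated even though the paired differences average out to exactly zero, so a paired-samples t-test would find nothing.

```python
import math
import statistics

def pearson_r(xs, ys):
    """Pearson correlation: do the two lists move together?"""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def paired_t_statistic(xs, ys):
    """Paired-samples t statistic: is the mean difference nonzero?"""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# Invented example data: same values, neighbours swapped.
control = [1, 2, 3, 4, 5, 6]
treated = [2, 1, 4, 3, 6, 5]

r = pearson_r(control, treated)
t = paired_t_statistic(control, treated)
print(f"correlation r = {r:.3f}, paired t = {t:.3f}")
```

Reading the first number as if it were the second would suggest an effect where the paired comparison shows none.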
But even if it’s only 1%, these will make up much more than 1% of published studies, and much more than 1% of important ground-breaking published studies, because correct studies can only prove true things, but false studies can prove arbitrarily interesting hypotheses (did you know there was an increase in the suicide rate on days that Donald Trump tweeted?!?) and those are the ones that will get published and become famous. So if the lesson of the original replication crisis was “read the methodology” and “read the preregistration document”, this year’s lesson is “read the raw data”. Which is a bit more of an ask. Especially since most studies don’t make it available. The Sociological Takeaway I’ve been thinking about this one a lot too. Ivermectin supporters were really wrong. I enjoy the idea of a cosmic joke where ivermectin sort of works in some senses in some areas. But the things people were claiming - that ivermectin has a 100% success rate, that you don’t need to take the vaccine because you can just take ivermectin instead, etc - have been untenable not just since the big negative trials came out this summer, but even by the standards of the early positive trials. Mahmud et al was big and positive and exciting, but it showed that ivermectin patients recovered in about 7 days on average instead of 9. I think the conventional wisdom - that the most extreme ivermectin supporters were mostly gullible rubes who were bamboozled by pseudoscience - was basically accurate. Mainstream medicine has reacted with slogans like “believe Science”. I don’t know if those kinds of slogans ever help, but they’re especially unhelpful here. A quick look at ivermectin supporters shows their problem is they believed Science too much. @jonno_bosch I work in hospitality so I need things to return to normal ASAP. I am using Ivermectin as a prophylactic. 
Hugely influenced by Carvallo trail and Chala trail which showed huge protection (Andrew Bannister, @Bannisterious, February 12, 2021)

@mtskullcrusher @HereComeTheJud @therealjosexy @joeycadre @PeegeRiley @dcwickedestcity @blaireerskine Read Raad. Or Mahmud. Or ICON study from Florida. Or Mexico City hospitalizations study. Or Niaee. Or... Or just type "ivermectin covid" in Google Scholar and read. (fatlas, @fatlas6, September 2, 2021)

They have a very reasonable-sounding belief, which is that if dozens of studies all say a drug works really well, then it probably works really well. When they see dozens of studies saying a drug works really well, and the elites saying “no don’t take it!”, their extremely natural conclusion is that it works really well but the elites are covering it up. Sometimes these people even have a specific theory for why elites are covering up ivermectin, like that pharma companies want you to use more expensive patented drugs instead. This theory is extremely plausible. Pharma companies are always trying to convince people to use expensive patented drugs instead of equally good generic alternatives. Ivermectin believers probably heard about this from the many, many good articles by responsible news outlets, discussing the many, many times pharma companies have tried to trick people into using more expensive patented medications. Like this ACSH article about Nexium. Or my article on esketamine. 
Given that dozens of studies said a drug worked, and elites continued to deny it worked, and there are well-known times where elites lie about drugs in order to make money, it was an incredibly reasonable inference that this was one of those times. If you have a lot of experience with pharma, you know who lies and who doesn’t, and you know what lies they’re willing to tell and which ones they shrink back from. As far as I know, no reputable scientist has ever come out and said ‘esketamine definitely works better than regular ketamine’. The regulatory system just heavily implied it. I claim that with ivermectin, even the people who don’t usually lie were saying it was ineffective, and they were saying it more directly and decisively than liars usually do. But most people can’t translate Pharma → English fluently enough to know where the space of “things people routinely lie about and nobody worries about it too much” ends. So they incredibly reasonably assume anything could be a lie. And if you don’t know which statements about pharmaceuticals are lies, “the one that has dozens of studies contradicting it” is a pretty good heuristic! If you tell these people to “believe Science”, you will just worsen the problem where they trust dozens of scientific studies done by scientists using the scientific method over the pronouncements of the CDC or whoever. So “believe experts”? That would have been better advice in this case. But the experts have beclowned themselves again and again throughout this pandemic, from the first stirrings of “anyone who worries about coronavirus reaching the US is dog-whistling anti-Chinese racism”, to the Surgeon-General tweeting “Don’t wear a face mask”, to government campaigns focusing entirely on hand-washing (HEPA filters? What are those?) Not only would a recommendation to trust experts be misleading, I don’t even think you could make it work. 
People would notice how often the experts were wrong, and your public awareness campaign would come to naught. But also: one of the data detectives who exposed some fraudulent ivermectin papers was a medical student, which puts him somewhere between pond scum and hookworms on the Medical Establishment Totem Pole. Some of the people whose studies he helped sink were distinguished Professors of Medicine and heads of Health Institutes. If anyone interprets “trust experts” as “mere medical students must not publicly challenge heads of Health Institutes”, then we’ve accidentally thrown the fundamental principle of science out with the bathwater. But Pierre Kory, spiritual leader of the Ivermectin Jihad, is a distinguished critical care doctor. What heuristic tells us “Medical students should be allowed to publicly challenge heads of Health Institutes” but not “Distinguished critical care doctors should be allowed to publicly challenge the CDC”? Then what about “believe statisticians”? I’ve never heard anyone propose this before, but re-centering the mystique of scientific-expertise in study-analyzers and study-aggregators rather than object-level scientists is…one way you could go, I guess. Statisticians admittedly sort of failed us here: the first several meta-analyses said ivermectin worked. But the statistical process - the idea that studies are raw materials, but it takes skill to turn them into the finished good of scientific knowledge - sort of comes out looking good. If we need to summarize our takeaway in a slogan of exactly two words, one of which is “trust”, you could do worse than this one. (am I secretly suggesting that we make rationality higher status? Maybe, although rationalists did no better here during the early phase of “looks promising so far” than anyone else, and it was researchers digging into the nitty-gritty of the data who really solved this.) Or maybe this is the wrong level on which to think about this. 
Maybe there isn’t and can’t be a simple heuristic you can teach everyone in school or via a PR campaign which will lead to them making good health decisions in an adversarial information environment, without having any negative effects anywhere else. But you also don’t want people to make bad health decisions. So what do you do? The Political Takeaway All of this is complicated by the impression many people (including me) have, that ivermectin boosterism and vaccine denialism are closely linked. The ivermectin evidence is complicated. There’s room for doubt. I can maybe see room for doubt on some marginal vaccine-related issues like how seriously to take the occasional reports of myocarditis in teens. But the basic issue - that the vaccine works really well and is incredibly safe for adults - seems beyond question. Yet people keep questioning it. I think it’s important to address ivermectin support on its own terms - as a potentially plausible scientific theory in a debris field of confusing evidence, which should be debated to the usual standards of scientific debate. I’ve tried to do that above. But this picture wouldn’t be complete without acknowledging the overlap with vaccine denial - a segment of people who are completely crazy and wrong and who happen to have fixated on this mildly interesting question as opposed to some other one with even less evidence. I’ve been trying to figure out a model where ivermectin support and vaccine denialism both make visceral sense to me, and here’s what I’ve got: Imagine that in 2025, an alien invasion fleet reaches Earth. But it got hit by a supernova on the way, the spaceships are partly disabled, and they’re only able to conquer some out-of-the-way place - let’s say Australia. There’s a few cycles of conflict and cease-fire, a few cities get nuked, and finally we settle into an uneasy peace. Over the next few years, humanity grudgingly admits the invaders into the world community. 
They get a seat in the United Nations. We sort of cooperate with them on projects that are important to both sides, like stopping climate change. We still hate them, but only at the level of ordinary international rivalries, like USA/USSR. In 2035, the aliens announce that a quantum memetic plague from the Andromeda Sector has reached Earth. Billions of people will die unless we let them put an immunity-granting cybernetic implant in all humans’ brain. The aliens admit we haven’t always been friends, and honestly they would still like to conquer us someday. But this plague is an ancient enemy of all sentient beings, they dealt with it on their homeworld eons ago, and they want to help us out here. Humans apparently don’t have the ability to detect quantum memetic plagues, but mortality rates for over-65s do seem weirdly high this year, something like 10x worse than a normal flu season. Do you let the aliens put an implant in your brain, or not? If it helps, the aliens look like this. Surely anyone with a brain that size must know what they’re talking about, right? (source) Fine, you don’t have to decide immediately. The brain implants aren’t even ready yet. Some human scientists suggest wearing face masks in the interim. The aliens say no, that will never work, that’s not how you deal with quantum memetic plagues, if you do anything other than wait for the brain implants you’re anti-science idiots who are wasting precious time and will kill millions of people. Human nations try face masks anyway…and they clearly and conspicuously work. The aliens say whatever, we’re still the advanced spacefaring civilization here, maybe it works for humans but that’s not the point, the point is you’ve got to let us put implants in your brains. Some human scientists suggest reopening vital services. The aliens say no, millions will die, this is “mass human sacrifice”, humans apparently must care nothing about their families’ lives. 
The humans try reopening anyway, and…it goes kind of okay? Maybe the death rate goes up 10% to 20% or so, hard to say? The aliens say whatever, maybe their calculations were off by a few orders of magnitude, the point is, you have to let us put implants in your brain or you’ll all die. Then some human scientists suggest vaccinating against the plague. The aliens say this is idiotic, vaccines originally come from cowpox, even the word “vaccine” comes from Latin vaccus meaning “cow”, are you saying you want cow medicine instead of actual brain implants which alien Science has proven will work? They make lots of cartoons displaying humans who want vaccines as having cow heads, or rolling around in cow poop. Meanwhile, the first few dozen studies show vaccines work great. Many top human leaders, including war heroes from the struggle against the aliens, get vaccines and are seen going out in public, looking healthy and happy. The aliens say that human science is hopelessly flawed because of complicated statistical concepts that inferior life forms like us don’t even have words for. You need to ignore all the studies and meta-analyses showing that vaccines definitely work, and let the aliens give you brain implants instead. So do you let the aliens put an implant in your brain, or not? Obviously you think long and hard before doing this. And obviously this is an extended metaphor for vaccine denialism. So what’s the difference between the metaphor (where you’re presumably anti-implant) and the real world (where you’re presumably pro-vaccine?) For me, it’s a combination of: The aliens are hostile, so I don’t trust them no matter how smart they are
I actually think this might be more of a crux between us than anything about ivermectin itself. The same people behind ivmmeta have put up websites claiming that 19 different substances, including HCQ, testosterone-blockers, the spice curcumin, vitamins A, C, and D, etc, all cure coronavirus with pretty large effect sizes. I think this is because they are using a nonconventional form of statistics which is always going to find positive effects. I understand and respect why they’re doing this - they link eg this article condemning the idea of statistical significance, which makes good points. But you can’t throw it out without having a replacement. I think ivmmeta is trying to pioneer a new way of thinking about science and statistics without p-values, but I think its new way is actually bad and will get positive results almost all the time. I’ve seen a lot of fruitless debate between ivmmeta and doctors, but I wonder if you could have a fruitful debate between them and statisticians.
We encourage the author to at least direct readers to government approved treatments, for which there are several in the author's country, and many more in other countries (including ivermectin). While approved treatments in a specific country may not be as effective (or as inexpensive) as current evidence-based protocols combining multiple treatments, they are better than dismissing everything as "unorthodox". Elimination of COVID-19 is a race against viral evolution. No treatment, vaccine, or intervention is 100% available and effective for all variants — we need to embrace all safe and effective means.
Author seems biased against believing any large effect size. We note that large effect sizes have been seen in several COVID-19 treatments approved by western health authorities, and also that better results may be expected when studies combine multiple effective treatments with complementary mechanisms of action (as physicians that treat COVID-19 early typically do).
I’m going to guess it’s not true, because I’ve become pretty critical of these people’s methodology since doing the ivermectin review. Also, curcumin is a PAIN (pan-assay interference compound, ie a substance with weird chemical properties that make every test seem positive, so if you do chemical tests to see whether it activates eg coronavirus-fighting immune cells, it will always say yes). This means people are always publishing exciting papers about it and alternative medicine people are always getting really enthusiastic about it and suggesting it as the cure for everything (eg depression).
When I reviewed Vitamin D, I said I was about 75% sure it didn’t work against COVID. When I reviewed ivermectin, I said I was about 90% sure.
Another way of looking at this is that I must think there’s a 25% chance Vitamin D works, and a 10% chance ivermectin does. Both substances are generally safe with few side effects. So (as many commenters brought up) there’s a Pascal’s Wager like argument that someone with COVID should take both. The downside is some mild inconvenience and cost (both drugs together probably cost $20 for a week-long course). The upside is a well-below-50% but still pretty substantial probability that they could save my life.
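To put a rough number on "well-below-50% but still pretty substantial": if the two probabilities are treated as independent (an assumption made here for illustration, not something the text claims), the chance that at least one of the two substances works can be computed directly. The function name is hypothetical.

```python
def prob_at_least_one_works(probabilities):
    """Chance that at least one treatment works, assuming independence."""
    p_none = 1.0
    for p in probabilities:
        p_none *= 1.0 - p  # chance this one fails, given independence
    return 1.0 - p_none

# 25% for Vitamin D and 10% for ivermectin, as stated in the text.
chance = prob_at_least_one_works([0.25, 0.10])
print(f"{chance:.1%}")
```

If the two beliefs are positively correlated (for example, both depend on the same doubts about small early-treatment trials), the real figure would be lower than this independent-case estimate.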
This is simply not true. I've pointed out exactly why this isn't true in the past to you as well and yet you continue to repeat it. I'm not sure why. Anyway, this is just pure ignorance of the relative risk scale and just how few deaths can radically shift that reported scale. The entire difference of mortality effect is only 39 people among a control group of 1984 patients. Assuming a 15.5% prevalence (average prevalence by parasitologic methods of the trials driving the favorable effect), and even assuming only a 5% chance of getting disseminated Strongyloides infection due to either immunosuppression (since fewer than half of the patients in the only paper semi-reporting prevalence that Rzztmass likes to cite were immunosuppressed from steroids) or eosinopenia associated with COVID (which happens even without steroids), that already explains ~15.5 deaths, which is already ~40% of the mortality benefit. That absolutely makes a dent. And even then I suspect this is a low estimate. Bottom line: you continue to not appreciate how small absolute differences in patient events can translate into large differences on a relative risk scale.
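The arithmetic in this comment can be checked in a few lines. The variable names are hypothetical. The sketch also makes explicit an assumption the comment leaves implicit: every disseminated infection is counted as a death.

```python
control_patients = 1984        # control-group size quoted in the comment
mortality_difference = 39      # deaths separating the two arms
prevalence = 0.155             # assumed Strongyloides prevalence
dissemination_risk = 0.05      # assumed chance of disseminated infection

# Assumption: each disseminated infection is treated as fatal.
worm_deaths = control_patients * prevalence * dissemination_risk
share_explained = worm_deaths / mortality_difference

print(f"deaths explained: {worm_deaths:.1f}")
print(f"share of mortality benefit: {share_explained:.0%}")
```

Under these inputs the result lands close to the comment's "~15.5 deaths" and "~40%". The point is that a handful of events moves a relative-risk estimate a long way when the total difference is only 39 deaths.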
2: Some good discussion on my Pascalian Medicine post, see especially David Chapman’s tweets and Jay Daigle’s blog. But I do feel like some of the responses flirt with assuming everything has the most convenient possible value to fit a morality tale. Suppose someone you love gets COVID, and you have the option to either recommend or disrecommend that they take a cocktail of melatonin (a harmless sleep supplement, I take it every night, eight unreliable studies have shown it treats COVID), curcumin (a harmless-when-sourced-correctly spice, six unreliable studies have shown it treats COVID) and Vitamin D (a harmless vitamin, twelve unreliable studies have shown it treats COVID). What do you do, here, in the real world? I’m honestly not sure, and I think my discomfort with this question is a lot more interesting than some too-pat fable about The Rationalist Who Thought The Real World Was Exactly Like A Casino.
The 2005 federal budget had $2.5 Trillion in expenditures, increasing to $4.4 Trillion in 2019, with a sharp jump to $6.6 Trillion in 2020 thanks to COVID (source). It's immediately clear that regardless of valuation method, America's total land values ($24-44T) are significantly higher than the annual federal budget. But we care about land rents, not land values. It's not like the plan is to sell off all of America's land just to pay for a few years' spending.
How Much Money Can We Raise from Land Rents? America's annual land rents are sufficient to cover between 18%-40% (Fed) and 34-78% (Smith) of annual federal spending. The low-end figures come from 2020, which was a major outlier in federal spending thanks to COVID.
[Professor Ed] Mills, who thinks that fluvoxamine and budesonide are both appropriate to prescribe to patients sick with Covid-19, compares public messaging on fluvoxamine to communications about Merck’s drug molnupiravir. The evidence for molnupiravir is in many ways weaker than the evidence for fluvoxamine, but molnupiravir was produced by a major pharmaceutical company that can shepherd it through the process of becoming a recommended drug. On a call last week, Mills said, the FDA told him “they don’t know how to deal with submissions where there isn’t someone to be responsible for it.”
Here’s my pitch for fluvoxamine (Luvox) for COVID.
In the midst of all the hype about ivermectin and hydroxychloroquine, scientists put together the giant 4,000-person TOGETHER trial, intended to test all these exciting COVID early treatments. You know what happened next: ivermectin and hydroxychloroquine crashed and burned.
1Day Sooner and Rethink Priorities, $17,500, to research public attitudes around human challenge trials. Human challenge trials are studies where scientists deliberately try to infect volunteers with a disease to see if a treatment can prevent or cure it. They're much faster than waiting for people to get the disease naturally, and could have significantly shortened the wait for coronavirus vaccines. But they're controversial and nobody was able to get approval to do a challenge trial for COVID until 2021, which is why we had to wait so long for good treatment. Preliminary research suggests lots of people support these trials; I think building common knowledge of this is a first step towards making them available during future pandemics. Rethink Priorities is a respected effective altruist research organization. 1Day Sooner is a group lobbying for challenge trials. They’re currently seeking $10 million to use challenge studies to develop a universal coronavirus vaccine. Email josh@1daysooner.org if you can help.
Alex Hoekstra, $100,000, for the Rapid Deployment Vaccine Collaborative (RaDVaC) to make open-source modular affordable vaccines. They've made a coronavirus vaccine which about fifty people (mostly scientists and biohackers) have self-administered, though there's no hard data on whether or not it works. They don't have regulatory agency approval for anything and probably won't get it, and they cannot sell their vaccine - the only way to get it is to manufacture it in your lab (or home lab) from the blueprints they make available. So what's the pitch for them being useful? First, global inaccessibility of vaccines has been a problem in past and present pandemics and will probably continue; RaDVaC thinks their open source model might “drive up vaccine access, diversity, and security in the future”. Second, if there's ever a pandemic much worse than COVID - super-Ebola or whatever - I'm not waiting nine months for the FDA to have the right number of meetings, neither is anyone else, and I think we’ll all be grateful if we previously built the capacity to have a vaccine production group that moves fast and breaks things. Third, I think it's possible that their comparative freedom lets them come up with something genuinely better than Big Pharma, at which point hopefully it will encourage or embarrass Big Pharma into stealing it (did you know RaDVaC offers nasal spray coronavirus vaccines?) Fourth, I think it has positive...let's say "moral"...effects for people to know that ordinary people can do the same things big corporations do, and that it's possible (and sometimes even legal) to innovate without getting anyone's permission first. 
RaDVaC still needs more funding (go here to donate) and are looking for collaborators with experience in open-source development (RaDVaC wants to build infrastructure for decentralized vaccine R&D, including: construction of standards for sourcing, production, & testing; data-sharing platforms; and other online & accessible scientific tools). Reach out to them here. You can read more about RaDVaC's work here, here, here, here, and here, and find their YouTube channel here.
Alfonso Escudero, $75,000, to create a platform for scientific collaborations. Alfonso and his team already made something like this for COVID research, which got 40,000 scientists to sign up, matched collaborator requests to experts willing to help, and resulted in some useful papers. Now they want to expand this model to other types of science. My father has been stalled on an important research project for years for lack of the right kind of statistician; Crowdfight (or whatever the final name turns out to be) aims to take requests like this and process them within 72 hours. I regret only being able to fund this at the minimum level, but I'm pretty sure that once they're up and running they'll be able to prove their value to richer people's satisfaction. You can also contribute by donating, by joining their community (if you want to be matched with scientists who might need your expertise) or, if you’re a professional scientist, by using their service to find a collaborator (it's free).
11: Why COVID variants skipped from Mu to Omicron: “In a statement, the WHO said it skipped Nu for clarity and Xi to avoid causing offense generally.” Rolling my eyes at “offense generally” and the idea of deliberately averting nominative determinism.
31: The Vitamin D / COVID debate continues, with a recent meta-analysis finding no effect but a Phase II trial of a patented formulation seeming to be successful. I have to admit I’ve kind of clocked out at this point and have no strong opinion on recent developments.
32: Nate Silver, Tyler Cowen, and Garrett Jones come out in favor of “the public health establishment deliberately delayed the COVID vaccine by a month so it wouldn’t make Trump look good before Election Day”. I haven't checked if it’s plausible that public health officials had political motives, but the fact is they made a deliberate decision to make the process take an extra month, and that some four-to-five-digit number of people died because of this decision. Even if we conclude they made this decision for less sinister reasons (like being over-cautious), it deserves to be scrutinized with the same rigor as other decisions that have killed this many people, like the decision to ignore intelligence warnings about 9-11.
There’s a debate over whether Don’t Look Up is supposed to be pushing the progressive line on climate change vs. the progressive line on COVID. I’m not sure it can honestly push either.
But apply it to COVID, and it’s even worse. Dr. Fauci and the CDC tell me every day that Pfizer’s vaccine is safe - but Male Scientist and NASA told their victims every day that Tech Company’s comet retrieval plan was safe. Sounds like we can’t trust scientific authorities when there might be a profit motive involved, better skip the jab! I hear ivermectin looks promising…
Second, a story that comes out of the Creationism Wars of the early 00s. We are the “reality-based community”, the sane people, the normal people, the people with college degrees and non-spittle-covered keyboards. They are unwashed uneducated lunatics who think that evolution is a lie and Obama was born in Kenya and vaccines cause autism and COVID isn’t real. Maybe they should have been clued in by the fact that 100% of smart people and institutions are on our side, and they are just a couple of weirdos who don’t even agree with each other consistently. If this narrative has a movie, it must be Idiocracy - though a runner up might be Behind the Curve, the documentary about flat-earthers.
Then COVID hit. We switched our dates to a Minecraft virtual world, where we built a house together. At the time, I completely missed the kabbalistic significance of this.
Micromarriages come from this post by Chris Olah. They’re a riff on micromorts, a one-in-a-million chance of dying. Risk analysts use micromorts to compare how dangerous different things are: scuba diving is 5 micromorts per dive; COVID is 2,500 micromorts per infection; climbing Mt. Everest is 30,000 micromorts per attempt. So by analogy, micromarriages are a one in a million chance of getting married. Maybe going to a party gets you 500 micromarriages, and signing up for a really good dating site gives you 10,000. If there’s a Mt. Everest equivalent, I don’t know about it.
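Since both units are just one-in-a-million chances, converting them to ordinary probabilities is a one-line calculation. The sketch below uses the figures quoted above. The function and variable names are hypothetical.

```python
ONE_IN_A_MILLION = 1e-6

def micro_units_to_probability(units):
    """Convert micromorts (or micromarriages) to a plain probability."""
    return units * ONE_IN_A_MILLION

# Figures as quoted in the text.
micromorts = {
    "scuba dive": 5,
    "COVID infection": 2_500,
    "Everest attempt": 30_000,
}

for activity, units in micromorts.items():
    print(f"{activity}: {micro_units_to_probability(units):.4%} chance of death")

# The same conversion works for the marriage examples.
dating_site = micro_units_to_probability(10_000)
```

On these numbers, the hypothetical "really good dating site" is a 1% chance of marriage, and an Everest attempt carries twelve times the risk of a COVID infection.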
COVID
23. Fewer than 10K daily average official COVID cases in US in December 2021: 30%
24. Fewer than 50K daily average COVID cases worldwide in December 2021: 1%
25. Greater than 66% of US population vaccinated against COVID: 50%
26. India's official case count is higher than US: 50%
27. Vitamin D is not generally recognized (eg NICE, UpToDate) as effective COVID treatment: 70%
28. Something else not currently used becomes first-line treatment for COVID: 40%
29. Some new variant not currently known is greater than 25% of cases: 50%
30. Some new variant where no existing vaccine is more than 50% effective: 40%
31. US approves AstraZeneca vaccine: 20%
32. Most people I see in the local grocery store aren't wearing a mask: 60%
YGLESIAS PREDICTIONS
1. Democrats lose both houses of Congress (90%) HOLD
2. Democrats lose at least two Senate seats (80%) HOLD
3. Democrats lose fewer than six Senate seats (80%) HOLD
4. Nancy Pelosi announces retirement plans (70%) HOLD
5. Stephen Breyer does not retire (60%) N/A
6. Some version of Build Back Better passes (60%) HOLD
7. Joe Biden is still president (90%) HOLD
8. At least one Biden cabinet-rank official resigns (70%) HOLD
9. No military conflict between the PRC and Taiwan (a worryingly low 90%) HOLD
10. New U.S. sanctions on Russia (70%) HOLD
11. Saudi Arabia and Israel establish diplomatic relations (60%) SELL to 50%
12. Fewer U.S. Covid deaths in 2022 than in 2021 (80%) BUY to 90%
13. Emmanuel Macron re-elected (60%) HOLD
14. Traffic light coalition exploits loopholes to get around the constitutional debt brake (70%) HOLD
15. No recession in 2021 (90%) SELL to 80%
16. Liz Cheney loses primary (80%) HOLD
17. Some version of USICA passes Congress (70%) HOLD
18. Lula elected president of Brazil (60%) SELL to 50%
19. China officially abandons Covid Zero (70%) HOLD
20. Fewer U.S. Covid-19 deaths in 2022 than in 2020 (80%) BUY to 90%
21. Additional booster shots of mRNA vaccines authorized for seniors (80%) HOLD
22. November 2022 year-on-year CPI growth is below 6% (70%) BUY to 80%
23. November 2022 year-on-year CPI growth is above 4% (70%) SELL to 50%
24. The Fed ends up doing more than its currently forecast three interest rate hikes (60%) HOLD
25. Russia does not invade Ukraine (60%) HOLD
26. Viktor Orbán loses power in Hungary (60%) HOLD
27. Sinn Fein becomes the largest party in the Northern Ireland assembly (60%) HOLD
28. The U.S. and Canada reach an agreement on softwood lumber (70%) HOLD
29. Democrats go down at least one governor on net (60%) HOLD
30. The unemployment rate stays between 4 and 5% (70%) SELL to 60% if you mean 12/22, to 40% if you mean it never gets outside that range at all
COVID
22. Fewer than 10K daily average official COVID cases in US in December 2022: 20%
23. Fewer than 50K daily average COVID cases worldwide in December 2022: 1%
24. >66% US population fully vaccinated (by current standards) against COVID: 70%
25. India's official case count is higher than US: 5%
26. Medical establishment reverses course and officially says any of Vitamin D, HCQ, or ivermectin is actually effective against COVID: 1%
27. FDA approves a COVID indication for fluvoxamine: 60%
28. Some new variant not currently known is greater than 25% of cases: 60%
29. Most people I see in the local grocery store 12/31/22 are wearing masks: 60%
30. Masks still required on domestic flights: 60%
31. CDC recommends that triple-vaxxed people get at least one more vax: 70%
32. China has fewer than 100,000 COVID cases this year (official estimate): 30%
These next two sections are based on Vox’s 22 Predictions For 2022 and Matt Yglesias’ predictions in his Predictions Are Hard post. In both cases, inspired by Zvi, I’ve given the original predictor’s estimate, then either stuck with it, or bought/sold to some other level. This is kind of unfair, because I get to see the original predictor’s thoughts and they don’t get to see mine - also, I’m a few weeks later than they are, and in a few cases that gives me extra knowledge. So:
#107: RADVAC: Open Source Vaccines RaDVaC is a non-profit organization working to maximize access to vaccines when & where most needed: the first days of a disease outbreak. Since March 2020 we have developed & published 12 coronavirus vaccine designs under an open-access license, and worked to catalyze vaccine development globally. We envision a renaissance in vaccinology that is -- Diverse & Decentralized: enabling a diverse, distributed participation in vaccine R&D through lower tech barriers to vaccine design & development -- Transparent: fostering broad, open access to R&D, tools, and data (with less opportunity for distrust) -- Collaborative: optimizes vaccine formulations and their immunological & epidemiological relevance using pooled research, standards, and data -- Resilient: tech platforms that are easily modifiable to adapt to new variants, and centered on conserved domains for durable, mutation-resistant utility -- Rapid: able to deploy high-quality vaccines, at scale, at the earliest days after an infectious disease is identified -- These goals are achievable through the proliferation of open, accessible, & adaptable technologies in vaccine design & production, and improved vaccine trialing models (RaDVaC is developing a novel challenge trial model for safer, faster, lower cost clinical trials). Funds will be used for additional staff and hours, research supplies and services, and cultivating an ecosystem of vaccine developers around the world. More information about supporting the project can be found at https://radvac.org/support/.
#89: A Wiki For Rebuilding Civilization After Disaster My name is Jehan, I've created the site Wikiciv.org as a guide to rebuilding civilization in case of global catastrophe. Its editing is crowdsourced like Wikipedia because a project this large is far too much for one person, or even a team. Technologies and raw materials are linked so both upstream and downstream technologies are easily accessible. There are other projects with similar goals, but they are 1) Not publicly accessible 2) The wrong scale. Books such as "The Knowledge" and "How to Invent Everything" are too cursory to be a practical guide for recreating critical technologies like steel, fertilizer and antibiotics. Meanwhile the "Manual for Civilization" from the Long Now Foundation is 3500 paper books in one corner of San Francisco. Wikiciv is fully open and available for database downloads. Distributed backups are encouraged to ensure resiliency during a disaster. Wikiciv could be helpful even for regional supply-chain disruptions. For example during the Covid-19 pandemic, there were critical oxygen shortages in India. It turns out that a reasonable oxygen generator can be made from zeolite and an air compressor. Wikiciv aims to be a single, interconnected database of "from scratch" manufacturing instructions for situations like these. It is the eventual goal of Wikiciv to be accepted as a Wikimedia Foundation project (like Wikipedia, Wikiquote, Wikivoyage etc). The better Wikiciv becomes, the more likely this is. Get in touch at admin@wikiciv.org
#106: Undercover Hospital Boss Program If everyone who worked in hospitals had to spend a night in theirs as a pretend patient every six months, the experience ought to get much better fast. Imagine a place optimized for healing, rest, calm, and happiness and you'd be hard-pressed to name anything you'd imagined that's present in most hospitals. Yet the people who can bring the vision and reality closer together often are blinded, or blind themselves, to what's happening in their places of work. To get started: Create a pilot program in one department of one hospital. Start with the top administrators. Don't proceed until COVID-19 isn't a significant risk for the program. And, to start gently, everyone knows the "patient" is really a boss. Then they report back to everyone what they experienced and saw. Budget is for an outside consultant to design and run the program, record impressions, facilitate discussion, and outline possible expansion of the program. The work will be in finding the hospital department and consultant. The budget will be to pay for the consultant and some amount for the hospital's time and bed. hospitalmysteryshopper@protonmail.com
Same idea, only more tenuous. We know someone will do an autopsy on Taylor Hawkins soon, and we probably trust it. But how do we figure out whether COVID originated in a lab? This question’s hack is to ask whether two public health agencies will claim it. If we trust the public health agencies, we can turn this mysterious past event into a forecasting question.
But this is a strong ask. Even if we don’t specifically distrust the agencies, this question is a combination of “did COVID originate in a lab?” and “how likely are public health agencies to claim this?”. I expect the question would have a different prediction if it asked about “one public health agency” or “five public health agencies” or “China’s public health agency” or “the public health agency during a hypothetical second Trump administration” or “before the end of 2030”. All of that means we can’t interpret the prediction literally as being about whether COVID originated in a lab.
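The way the two questions get mixed together can be made concrete with the law of total probability. The sketch below is illustrative only: the function name `p_agencies_claim` and every number in it are assumptions chosen to show the effect, not figures from the forecasting question or from any forecast.

```python
def p_agencies_claim(p_lab, p_claim_if_lab, p_claim_if_not_lab):
    """Probability the question resolves positively (agencies claim a lab origin).

    Law of total probability over the two possible origins:
    P(claim) = P(lab) * P(claim | lab) + P(not lab) * P(claim | not lab)
    """
    return p_lab * p_claim_if_lab + (1 - p_lab) * p_claim_if_not_lab


# Hypothetical inputs: the same belief about the origin in both cases.
p_lab = 0.5

# Strict wording (e.g. many agencies, short deadline): claims are unlikely either way.
strict = p_agencies_claim(p_lab, p_claim_if_lab=0.2, p_claim_if_not_lab=0.01)

# Lenient wording (e.g. one agency, long deadline): claims are more likely either way.
lenient = p_agencies_claim(p_lab, p_claim_if_lab=0.6, p_claim_if_not_lab=0.05)

print(f"strict wording:  {strict:.3f}")
print(f"lenient wording: {lenient:.3f}")
```

With the belief about the origin held fixed, changing only how willing the agencies are to make the claim moves the headline number a lot, which is why the prediction cannot be read as a direct estimate of the origin question.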
15: Ivermectin updates: the big Brazilian study that showed ivermectin doesn’t work was officially released. This doesn’t update my analysis because I had included a preliminary version of it. See Gideon Meyerowitz-Katz’s take on some objections here. Another big study from Malaysia also came out; the headline result is “doesn’t work” but Meyerowitz-Katz thinks it’s more complicated (although still leans negative). Avi Bitterman et al formally published their “ivermectin efficacy only in areas with parasitic worms” paper in JAMA. Alexandros Marinos still thinks it works.
16: Related: very large and impressive RCT shows no effect for Vitamin D on COVID.
The Spanish RCT studying vitamin D for COVID used a dosage regimen that - according to Chris Masterjohn's summary - was "equivalent to 106,400 IU vitamin D on day one, 53,200 IU on days three and seven, and 53,200 IU weekly thereafter." [Maybe the high doses explain why it found positive results unlike all the other studies].
The Spanish RCT studying vitamin D for COVID used a dosage regimen that - according to Chris Masterjohn's summary - was "equivalent to 106,400 IU vitamin D on day 1, 53,200 IU on days 3 and 7, and 53,200 IU weekly thereafter." Some of these are heroic doses, and the dosage regimen hardly seems optimal, but this is for people who had already been hospitalized with COVID, a situation of acute illness where the body might be churning through a tremendous amount of vitamin D. (For similar reasons I've started taking vitamin C megadoses when I get sick, because several grams per day could easily make a big difference even though the much smaller doses in RCTs don't.) On the exercise scale, this is equivalent to walking a double marathon on the first day, a single marathon on days 3 and 7, and weekly thereafter.
I’ve said many times that (to a first approximation) Vitamin D is a boring bone-related chemical. Most claims that it does exciting things outside of bones - cure COVID! prevent cancer! decrease cardiovascular risk! - are hype, and have failed to stand up to replication.
As to total censorship: the internet is routing around. The outbreak of Covid is a classic example. The news got out, really fast. Much faster than the authorities wanted. And they cracked down later, notoriously jailing the doctor who broke the story. But the story still got out. That was basically unthinkable under Hu. (Example: the city where I live, Xiamen, was the site of one of China's few successful environmental protests, back in about 2007. I watched them march in the streets to stop a chemical plant being built near our city center. That news never got out - never reached other people, never got into the media.)
In his spare time, Zink is the author of books including Signs: You’re In San Diego (a book of photos of San Diego signs), a book of COVID-19 memes, and educational books for children.
He does have a wacky perspective on the COVID-19 pandemic:
We are now finding out that the COVID-19 pandemic is part of a global plan being orchestrated by the World Economic Forum, headed by Klaus Schwab. Read his book, published in August 2020, “COVID-19: The Great Reset.”
90% chance of fewer than 400,000 cases. 95% chance of fewer than 2.2 million cases. 98% chance of fewer than 500 million cases. This is encouraging, but a 2% chance of >500 million cases (there have been about 500 million recorded COVID infections total) is still very bad. Does Metaculus say this because it’s true, or because there will always be a few crazy people entering very large numbers without modeling anything carefully? I’m not sure. How would you test that? Warcasting The war in Ukraine has shifted into a new phase, with Russia concentrating in Donetsk and Luhansk, and finally beginning to make good use of its artillery advantage. I’m going to stop following the old Kiev-centric set of questions and replace them with more appropriate ones: Notice that this continues to rise, from 16% a month ago to 22% today. See Eikonal’s comment here for some discussion of how this might happen and what territories these might be (and note that we switched from Ukrainian control in the last question to Russian control in this one). I’m keeping this one in here, but it never changes. Meanwhile, on Insight Prediction: $2000 in liquidity and still 14% off from Metaculus, weird. Musk Vs. Marcus Elon Musk recently said he thought we might have AGI before 2029, and Gary Marcus said we wouldn’t and offered to bet on it. It’s an important tradition of AGI discussions that nobody can ever agree on a definition of it and it has to be re-invented every time the topic comes up. Marcus proposed five different things he thought an AI couldn’t do before 2029, such that if it does them, he admits he was wrong and Musk wins the bet (which is purely hypothetical at this point; Musk hasn’t responded).
The AI would have to do at least three of: Read a novel and answer complicated questions about eg the themes (existing language models can do this with pre-digested novels, eg LAMDA talking about Les Miserables here - I think Marcus means you have to give it a new novel that it has no corpus of humans ever having discussed before, and make it do the work itself).
5: The Atlantic on Why So Many COVID Predictions Were Wrong.
It’s all a competition to see who can signal “I hate Putin” the most, but Germany was still shutting down all its nuclear power plants to rely on Russian gas despite warnings from every other EU state (Russia accounts for 40% of Europe’s gas imports) — so much for grand strategy. That is not to excuse Putin’s invasion (he is, after all, the aggressor) and no, Ukraine is not “the West’s fault” as Mearsheimer has claimed in his viral lecture, but “NATO’s door remains open” for me and “we're going to start WW3 because you're in my sphere of influence” for thee is no grand strategy at all. Indeed, the irrational Western response is not predictable by the unitary actor model, but by the public choice model. Hanania writes: If you were going to cut Russia off from SWIFT, for example, why wouldn’t you announce it beforehand? The whole point of a punishment like that is supposed to be its deterrent effect, but if you don’t communicate that a specific action will happen, then it can’t influence behaviour. The answer here seems to be a lack of grand strategy, with leaders responding to events according to emotion and public relations more than anything. Cutting off SWIFT, or even threatening to do so, seems extreme before an invasion occurs, but not after it has begun. The West cannot rely on sanctions to make Russia abandon its core national security interests, which at the very least include a no-NATO commitment, the acceptance of the secession of Donetsk and Luhansk, and the recognition of the annexation of Crimea. Sanctions will also push Putin closer to Beijing, and the US will continue down the self-defeating path of alienating both of the other two superpowers — so much for American grand strategy. Hanania writes: Even if Putin has maximalist aims at this point, that doesn’t mean sanctions are worth doing. Their costs are high and they may have major consequences for the global economy. 
One has to consider the possibility that they make Russia more repressive at home and more brutal in its prosecution of the war. Putin is getting sanctioned, but ordinary Russians are getting cancelled. The Metropolitan Opera of New York has announced it will no longer stage performers who have supported Russian President Vladimir Putin. Carnegie Hall has done the same, and the Royal Opera House in London is cancelling a planned Bolshoi Ballet residency (one of the oldest and most prestigious ballet companies in the world). Eurovision banned Russia. Tchaikovsky is cancelled. As Tyler Cowen writes, cancel culture against Russians is the new McCarthyism. The culture war has morphed into a hyperreal form on the Internet. Just as COVID is the first pandemic in the Age of Twitter, so the Ukraine invasion is, in some sense, the first war in the Age of Twitter. As it unfolds, we are seeing many disturbing parallels to the events of early 2020. People are rapidly normalising once-fringe ideas like a NATO-enforced no-fly zone (while completely oblivious to the fact that it means shooting down Russian planes and causing WW3), direct US conflict with Russia, regime change in Moscow, and even, incredibly, the use of nuclear weapons. The overnight flips on German defence spending and SWIFT are like the overturning of conventional public health policies on masking and lockdowns. We have entered the age of shitpost diplomacy, as coined by Tanner Greer, in which the official Twitter account of the US Embassy in Kiev literally posts memes to spite Putin: A Russian sixth-grader could explain why celebrating the glories of Kievan Rus does not subvert Putin’s claims about the history of the Russian nation so much as reinforce them. Just like Hong Kong’s protests, Ukraine has won the meme war with utterly lopsided propaganda and unanimous international support on the Internet.
As Yoshimi writes: Floating ghostlike above it is our war, the myth of the ‘Ghost of Kyiv’, ace MiG-29 pilot who has apparently shot down six Russian planes, or the legend of the Ukrainian soldiers defending an island outpost who replied “Russian warship go fuck yourselves” to a surrender offer and may or may not have died heroically, or two Russian Il-76 transport aircraft that maybe were shot down near Kiev, or videos of air strikes or dead bodies which variously are Russian or Ukrainian until they turn out to be from Gaza six years ago, or the viral video of an old Ukrainian woman telling off a Russian soldier by offering him sunflower seeds so when he dies, sunflowers (Ukraine’s national flower) will sprout from the soil. We’re raising funds for the Ukrainian army on crowdfunding apps and giving advice to the civilians being handed assault weapons about how to disable tanks, sharing weird homophobic pictures of Putin as a gay icon and spamming Russian government posts. Ukrainian president Volodymyr Zelensky has made the decision to stay and fight rather than flee like most would-be leaders who go all in for American foreign policy, and now is being deified by us as “badass”, “a true leader”, etc. etc., alongside his people, whose resistance to authoritarianism we are told is unparalleled in the modern world. After all, so it goes, who could be next? And like in Hong Kong, despite winning the culture war in hyperreality, the actual war in reality is won by the side with overwhelming military might, not morality. The real war is where Ukrainians are experiencing the genuine life-shattering effects of military conflict. It matters because this is the first time Western response is driven by Twitter outcry, and it will not be the last. A New EA Cause?
Besides Hanania’s recommendations in the last section (which he admits are more or less impossible in an excellent interview with Caplan), a worthy EA priority might be to somehow turn the public tide on sanctions, which literally kill more people than Putin. Americans should be appalled by the atrocity committed in their names. The banality of the incompetence of foreign policy elites does not excuse their evil. With how entrenched the special interests are, I have no idea if it’s even worth trying, but at the very least the sheer amount of suffering and death from sanctions should be made common knowledge. Nuclear security is one of the top priorities in Effective Altruism, per 80,000 Hours, Future of Life Institute, and Our World In Data. Toby Ord, who wrote the definitive book on existential risk, The Precipice, estimates x-risk from nuclear war to be ~1 in 1000 in the next century. Luisa Rodriguez estimates a 1.1% chance of nuclear war each year and that the chances of a US-Russia nuclear war may be in the ballpark of 0.38% per year; summarised by Max Roser as: Nuclear risk is neglected by the public because of Pax Americana since the collapse of the USSR, and is not discussed as often in EA as it’s thought to be relatively well-funded and mainstream, but in fact major donors like the MacArthur Foundation have been withdrawing funding. As Joan Rohlfing details in an 80,000 Hours podcast there is much to be done, especially when Ukraine gave up their nuclear arsenal in 1994 in exchange for Russia’s promise to never threaten or use military force against them. A worthwhile adjacent cause area might be de-escalation of public outcry to reduce x-risk from nuclear war beyond just regular anti-proliferation efforts - even a Russian specialist from the RAND Corporation is surprised by how much public outrage is driving policy: Even just the pace of the sanctions: we went to 11 out of 10 in like two days - farther than many expected we’d ever get in short order.
And I think the same is true about these military assistance initiatives. We’re just trying to do something because there’s a public demand for action. So that’s what worries me, that the sort of public outrage that’s being channeled in Western democracies through political systems could result in decisions that prove ultimately unwise. Despite how odd it is that some wars are “legal” while others aren’t, we should be glad UNSC exists as much as everyone laughs at how useless the rest of the UN is. All is fair in love and war, but international norms are all that stand between us and nuclear annihilation. It is hard to emphasise just how delusional it is for the public to fixate on no-fly zones - I, like Scott, am surprised we’re still capable of jingoism. 80,000 Hours has updated their top career recommendations to include China specialist to improve China-Western coordination on global catastrophic risk, which seems more important after reading how irrational and captured the American foreign policy apparatus is. As Hanania writes, “great power competition” is an anachronism. If Ukraine is the first war warped by hyperreality, it won’t be the last. Now that US foreign policy elites have driven Putin into the arms of China, let’s hope IR specialists can imbibe the public choice model instead of antagonising yet another nuclear rival. Public Choice Theory and the Illusion of Grand Strategy is an important work because it raises the sanity waterline, which at the least should make us stop killing millions for no reason, and at the most should make the human race more knowledgeable of how to prevent total extinction from nuclear armageddon. Pax Americana is dead, but a multipolar world will be more humane. Endnotes In the fiscal year 2018, the top five government contractors were all weapons manufacturers, with Lockheed Martin in first place at $40.6 billion.
The Department of Defence spent $358 billion on contracting, ten times higher than second place Department of Energy. Collective action problems that stop a bunch of smaller companies from effectively influencing policy are no hindrance for companies like Lockheed Martin.
I can tell you [the shoplifting situation is] actually very simple. Almost all data on property crimes is garbage. Most people do not reliably report property crimes of any kind, if you look at the National Crime Victimization Survey you see only about 30% of larceny is ever reported. Shoplifting in particular is almost never reported in large cities, retail workers do not care. They don’t get paid to care, they don’t need a police report for insurance, it takes hours to get a police response, and nothing happens to shoplifters in any large progressive city anyway. I used to ask retail employees in my beat, they would always tell me about rampant shoplifting they simply didn’t bother to report. Changes in reporting caused by Prop 47, COVID and other random factors like police response time will always swamp any actual change in crime. Property crime statistics are worthless. People should believe their own eyes.
Another major theme is the emergence of the Eternal Present. Pseudoevents [10] come and go in rapid succession, everywhere and then nowhere at all. Social media has only accelerated the turnover. The news cycle generates nonstop whiplash. Yesterday it was Covid, today it’s Ukraine; tomorrow both will be memory-holed. Last year’s news has already vanished without a trace. Whither Kazakhstan? Afghanistan? Who knows and who cares?
This is not a fantasy - this is your news feed. The U.S. is predicting a false-flag attack by Russia in the Ukraine. Russia accused the UK of a false-flag attack in Syria. The U.S. accuses China of genocide. China and Iran claimed COVID was a U.S. bio-attack.
Of course people buy into things like Dead Internet Theory. Of course everyone’s flailing about, falling into rabbit holes that get more and more bizarre. Conspiracy theories are modern myths, blooming in the fertile soil of the spectacle. The mainstream news itself is little more than ceaseless conspiracy-mongering at this point. Look at the parade the last few years - Russiagate, Pizzagate, COVID, 2020 election, Jan. 6th… Whatever you might think about those highly controversial topics, many millions of people vehemently disagree with you. They live in an alternate universe. Many millions of other people agree with whatever your stance is - but for reasons so insane and illogical that they also inhabit a totally different reality.
6: The Economist: COVID Learning Loss Is A Total Disaster. I feel awkward here, because I’d previously predicted that Kids Can Recover From Missing Even A Lot Of School, but I don’t think these are quite as contradictory as they seem at first glance. The article mentions that “Data from a few rich countries suggest that schoolchildren in those places are gradually catching up…by last autumn third-graders in Ohio had made back two-thirds of the learning that was found to have been lost by the start of the 2020-21 school year”, which matches my prediction. The problem is that some middle-income countries without good vaccine access kept schools closed really long, took the excuse to cut funding, let students drift away and not return after reopenings, or that “school buildings have decayed…some were looted or damaged during long closures.”
8: After decades of decline, world hunger is rising again, hopefully this is just temporary due to COVID and Ukraine.
33: I used to hope that freedom and tolerance would win in the end because everyone would realize that they were weird and unpopular in some way, and so tolerating weird unpopular people was in everybody’s common interest (cf. “They came for the Communists, but I did not complain…”). Since then the world has taken every opportunity to disabuse me of the notion that this could ever possibly work, but I guess it’s still possible to disappoint me. The latest example is /r/forcedbreeding, a fetish subreddit about men enslaving, raping, and forcibly impregnating women, which shut down recently to protest Reddit for not censoring pro-Russian subreddits enough. Apparently they’re back up now, but their top stickied post is still a demand that Reddit ban anti-COVID-vaccine subreddits. Another metaphor for life?
The medical consensus about CFS is different from both of these extremes, with the current conception being that there are a number of different pathways into the condition. Some of them are likely triggered by the immunological response to viral infections, with COVID being the most recent example of one of these (hence ‘Long COVID’). But, because not everyone who gets COVID/EBV/whatever develops this syndrome, something else needs to be going on. That something, in the case of COVID, could be a particular immunological profile leading to some ongoing inflammatory response, but this hasn’t been established for other viruses. Not to mention that when people actually try to characterise whatever this immunological response might be, the state of the evidence appears to be ‘cytokines, natural killer cells, T-cells something something’.
1) I’m not certain where other folks are getting semaglutide, but I would imagine Chinese pharma companies. You may have heard of the pharmacy in SLC, UT that attempted to compound hcq for COVID and sell it to the state of Utah? He bought it from a Chinese pharma company and what he eventually got busted for was false shipping manifests, not the blatant violation of the FDCA. I imagine something similar happened here.
"Hello! This is SmartSave pharmacy, home of compassionate savings on drugs, vitamins, and more. Our hours are from 9 AM to 5 PM Mondays through Fridays, 10 AM to 4 PM Saturdays, 10 AM to 3 PM Sundays, 10 AM to 3:30 PM on Christmas, Christmas Eve, Thanksgiving, and New Years, and 9:30 AM to 4:30 PM on Easter, Memorial Day, and Tu B'Shevat. We now have the COVID booster at SmartSave pharmacy! Did you know that COVID boosters can protect against all variants of COVID? Schedule your COVID booster now! We also have the SmartSave card, a great source for all prescription drug savings. ¡Hola! Esta es la farmacia SmartSave, hogar de ahorros compasivos en medicamentos, vitaminas y más. Nuestro horario es de 9 a. m. a 5 p. m. de lunes a viernes, de 10 a. m. a 4 p. m. los sábados, de 10 a. m. a 3 p. m. los domingos, de 10 a. :30 p. m. en Semana Santa, Día de los Caídos y Tu B'Shevat. ¡Ya tenemos el refuerzo COVID en farmacia SmartSave! ¿Sabía que los refuerzos de COVID pueden proteger contra todas las variantes de COVID? ¡Programe su refuerzo COVID ahora! Tambien tenemos la tarjeta SmartSave, una gran fuente para todos los ahorros en medicamentos recetados. If you would like to hear this introductory message again, press 1. Otherwise, stay on the line."
"Hello! This is SmartSave pharmacy, home of compassionate savings on drugs, vitamins, and more. Our hours are from 9 AM to 5 PM Mondays through Fridays, 10 AM to 4 PM Saturdays, 10 AM to 3 PM Sundays, 10 AM to 3:30 PM on Christmas, Thanksgiving, and New Years, and 9:30 AM to 4:30 PM on Easter, Memorial Day, and Tu B'Shevat. We now have the COVID booster at SmartSave pharmacy! Did you know that COVID boosters can protect against all variants of COVID? Schedule your COVID booster now! We now have the SmartSave card, a great source for all prescription drug savings. ¡Hola! Esta es la farmacia SmartSave, hogar de ahorros compasivos en medicamentos, vitaminas y más. Nuestro horario es de 9 a. m. a 5 p. m. de lunes a viernes, de 10 a. m. a 4 p. m. los sábados, de 10 a. m. a 3 p. m. los domingos, de 10 a. :30 p. m. en Semana Santa, Día de los Caídos y Tu B'Shevat. ¡Ya tenemos el refuerzo COVID en farmacia SmartSave! ¿Sabía que los refuerzos de COVID pueden proteger contra todas las variantes de COVID? ¡Programe su refuerzo COVID ahora! Ahora tenemos la tarjeta SmartSave, una gran fuente para todos los ahorros en medicamentos recetados. If you would like to hear this introductory message again, press 1. Otherwise, stay on the line."
"Hello! This is the interruption of the hold music that you will think means someone has finally taken your call, but it's actually just an attempt to advertise to a captive audience! Did you know SmartSave pharmacy is your home base for dealing with the COVID-19 pandemic? We know that in these trying times people are looking for a pharmacy they can trust, and SmartSave is here for you."
In order to find people who were saying this when it wasn’t true, I restricted my Google search to articles from before June 1 2020. Most of the articles I found were from establishment media sources, for example Los Angeles Times’ The Flu Has Killed Far More People Than Coronavirus. So Why All The Frenzy About COVID-19? or Kaiser Health Network’s Something Much Deadlier Than The Wuhan Virus Lurks Near You. These articles were written before COVID had spread very far in the United States, and were right that it had (thus far) killed far fewer people than the flu that year. This was obviously an idiotic way to think about it, and I yelled at them at the time. Still, they weren’t making anything up, just thinking about the (true) relative death counts in a really dumb way.
Last week I wrote The Media Very Rarely Lies. I argued that, although the media is often deceptive and misleading, it very rarely makes up facts. Instead, it focuses on the (true) facts it wants you to think about, and ignores other true facts that contradict them or add context. This is true of establishment media like the New York Times, but also of fringe media like Infowars. All of the “misinformation” out there about COVID, voter fraud, conspiracies, whatever - is mostly people saying true facts in out-of-context misleading ways.
Did you miss out on the past couple years, when there was a ton of straight up lying? Lying about fraudulent votes? Straight up lies about the vaccine killing people? Straight up lies about covid being less dangerous than the flu?
Did anyone in your family (as per your best guess) die of COVID vaccine side effects? I got 917 responses so far. On Kirsch’s original poll, the answers were 3.5% and 7.9%; on my survey, they were 6.8% and 0.9%. I think my higher rate of COVID deaths was because I carelessly changed “household” to “family”, which includes eg extended family. But why did I get so many fewer vaccine deaths? Looking at these people's other responses, they did not show a consistent tendency to make things up or say outrageous things (except for one who listed their religion as “Satanist”). That having been said, they did have an atypical response pattern; most ACX readers are white male Westerners, but these people were 38% female, 38% nonwhite, and 88% non-American. Highest degree was 12% high school, 25% college grad, and 63% postgrad; IQs were listed as extremely high, just like everyone else who gives their IQs on my survey. Politics were significant for 25% Marxist (otherwise a rarity in my survey), but otherwise typical, and did not lean right-wing. They were slightly, but not overwhelmingly, more likely to distrust the media and dislike strong COVID responses than other survey respondents. Overall I don't feel like I learned too much from examining them. The survey is still open (take it now if you haven’t already!) and I'm hoping to get more data on this later. 5: Comments Pointing Out Very Clear Examples Of Media Lies Several people agreed with the wider point, but tried to find a counterexample - a media lie so explicit that nobody could ever deny it. Some people noted that the term “fake news”, when invented in 2016, was originally applied to a very specific kind of fake article, often from weird Macedonian article mills, that were saying utterly fake stuff in a way that even Infowars didn’t.
Robert Stadler: This was what was interesting about the phenomenon of "fake news" during the 2016 election, before that term was successfully hijacked by Donald Trump to mean "news stories I don't like." There was a wave of what looked like news articles, spread largely via Facebook, that were entirely fictitious. The people writing those "articles" were not journalists and were not trying to be journalists. They made up the stories out of a mix of rumor and complete fabrications, either for political purposes or just as click-bait (this has never been entirely clear to me). It's unfortunate that the term "fake news" has been so thoroughly tainted, because the existence of those articles was genuinely noteworthy, and it's now harder to talk about them . . . I don't remember any myself (since it's been 6 years), but here's a study which has some specifics - http://web.stanford.edu/~gentzkow/research/fakenews.pdf After some searching, Benjamin Jest (writes As Fair A Name) was finally able to produce a specific example - Nancy Pelosi Hanged At Gitmo - which does, indeed, claim that leading US Democrat Nancy Pelosi was hanged at Guantanamo Bay for “treason and conspiracy” on December 27, 2022. It seems to suggest that the order was given by Donald Trump, who is still President, and that Hillary Clinton had already been executed in the same manner in April 2021. I will admit this is definitely an example of a “news source” making things up rather than just stretching the truth. The source, RealRawNews, claims on its About Page to be a “parody site”, but this outside article about them says they go back and forth between claiming to be a parody and claiming to be real. Some of their claims are more plausible than the Gitmo one - for example, that many Air Force pilots were resigning because of the COVID vaccine mandate - but equally false. 
They seem to go back and forth between “things that some conservatives might believe to be true” and “things that are obviously false but maybe gratify conservatives’ id”, adding or subtracting the “parody” label based on which one they’re doing at the time. It’s a fascinating business model, and I guess the term “fake news” fairly applies to it. Yug Gnirob writes: I don't know how to find them, but I definitely remember several completely fake articles about Trump during and immediately after the election. One of them was him citing "an ancient law" that prevented President Obama from doing... some liberal thing, I don't remember what. The most memorable one was immediately after the "Muslim Ban", where they claimed it had resulted in the arrest of a high-priority terrorist on day 1. I feel like that one showed up on one of the fact check sites, but I'm not seeing it on Snopes. I remember Stephen Colbert reporting the articles had been tracked down to a couple of Macedonian teens, who had discovered that writing fabricated pro-Trump articles was an easy way to make money. 6: Comments Making Other Claims Of Media Lies And Misdeeds — Beowulf888 on the LA Times and COVID: Well, there are media outlets that propagandize—but I think it boils down to if it bleeds it leads. Most corporate media outlets have the economic incentive to increase the readership by grabbing one's attention with scary headlines and articles. The perfect example of this phenomenon was in April 2020 when the LA Times interviewed an atmospheric chemist at Scripps. She made the claim that SARS2 virus particles in sewage were being carried back to land by sea spray. The reporters and editors uncritically relayed her comments as if she were an expert with the same credentialled expertise as a virologist or epidemiologist. There are numerous reasons why this would be very very low on the threat level even with what little we knew about the SARS2 virus at that time. 
This story was picked up by the media everywhere, and county health officials (either because there was public pressure to do so, or because they really believed her) shut down beaches up and down the coast of California. Did the LA Times and the news media really have any motivation to promote the closure of public beaches? I can't imagine they did. But they did have a scary headline that would promote readership and spread LA Times as a news source. Some weeks later the LA Times did a retraction, but by that time it had entered the popular imagination that beaches were a potential vector for COVID infection. I’m developing an allergy to the word “uncritically”. Being able to fact-check scientists is a rare skill - I’m not surprised nobody at the LA Times had it ready to deploy for this exact article. — Mike Mulligan writes: The pushback is largely because you are doing a false equivocation between the New York Times (who you hate and have a vendetta against) and Infowars (who you are pretending does basically the same thing as other outlets). And you know this, but on your own metric it won't count as a lie, because you just selectively misrepresented things. On the two articles in this series, I’ve included phrases like “This doesn’t mean these establishment papers are exactly as bad as Infowars; just that when they do err, it’s by committing a more venial version of the same sin Infowars commits” and “Again, my goal here isn’t to . . . say NYT is exactly as bad as Infowars” and tried to explain the exact way that two things can both commit a similar error without one being exactly as the other (Hitler and someone who shot a robber in self-defense both committed a similar action called “killing people”, but this doesn’t mean they both killed exactly the same people with exactly the same level of justification). 
Still, I got numerous comments getting angry at me for saying that I was calling NYT exactly as bad as Infowars, and saying I was being deceptive / lying because of this. This is why I’m so convinced people are erring on the side of too mistrustful - you can fill your articles with sentences about how you’re not claiming X, and people will still find ways to accuse you of lying because you said X. — Garrett writes: [The way Infowars covered Obama’s birth certificate] isn't any different from eg. mainstream media coverage of anything which involves firearms. They make (or promulgate) so many stupid technical errors I've stopped paying attention to them at all. They could have 1 person on staff who's responsibility is to understand firearms and run everything past them. But they don't. To what should I attribute this continual stream of errors? Is mainstream media coverage of firearms honestly flawed? Is it “reckless disregard for truth?” Is it a “lie of egregious sloppiness?” I think your answer to this question will depend more on how bad you want to accuse the mainstream media of being, relative to other forms of media, than on how you define these inherently slippery terms. — Jeremy Goldberg writes: There's an outright lie right now on the Washington Post homepage. A caption above a graph showing the inflation rate over time states, "Elevated prices coming down, annualized rate shows." The chart shows the current inflation rate is 7.1 percent, down from a high of around 9 percent. Elevated prices are not coming down at all. They just aren't elevating as fast anymore. I asked Jeremy to guess the probability that this was an honest mistake vs. malice. He said (thanks for giving a clear answer!) 60-40 in favor of malice. I think this is pretty high, given that I had to read Jeremy’s comment several times before I realized what the error was supposed to be, but I’ve already said I lean towards the “all the rest of you are extremely paranoid” side of things. 
— Jiro writes: I opened a thread on dsl: https://www.datasecretslox.com/index.php/topic,8430.0.html People brought up several examples there. You can read the thread. One of the more famous examples was saying that Kyle Rittenhouse crossed state lines with a weapon. There are also a bunch of cases where the media says there's "no evidence" for something that has evidence. Someone also brought up your own example of people "tested for drugs" when they were actually just asked if they used drugs. I would count that as an outright lie, even though you don't. I disagree that being asked if someone used drugs is a "test". Oh god, if saying there’s “no evidence” for something counts as a lie, then every media source in the country stands hopelessly condemned. I did write an article (here) on what the people who use that phrase might be thinking (if you can call it that). I agree the Rittenhouse situation was pretty egregious, though commenters bring up that since he went across state lines and had a weapon, it wasn’t unreasonable for people to assume he brought the weapon across state lines. Still, you wonder whether news sources would have repeated reasonable-sounding-but-didn’t-actually-check slanders about someone they liked. I do think this is a good antidote to some of the “mainstream media is actually very careful and fact-checks everything in their original reporting” takes in the comments section. — David Riceman says: How about Richard Landes's new book "Can the whole world be wrong?" about the many lies in the cognitive war against Israel (e.g. Muhammad Al Dura) See his discussion here for why he thinks this is a good example. — FractalCycle writes: I'm collecting examples from other people, will post ones that seem like real counterexamples as I get them. 
Here's one from recently: https://forum.effectivealtruism.org/posts/jsByfxvNA4x23stLY/a-letter-to-the-bulletin-of-atomic-scientists Yes, I included this issue with the Bulletin Of Atomic Scientists in my last links post, and they really do come out looking very bad here. See here for more discussion. — Hank Wilbon (writes Partial Magic) writes: I think the false Rolling Stone story a decade ago about the frat gang rape counts as the media explicitly lying, particularly as Rolling Stone is historically known for good fact checking (It is a plot point in the movie Almost Famous), however I think that counts as a "very rare" case and that Scott's claim is correct. I asked “Why? A woman said she had been raped, and Rolling Stone believed her. The woman was making it up, but Rolling Stone wasn't” and Deepa commented “Isn't it the job of a reporter to investigate? And be good at it?” I don’t want to pick on Deepa, but this is what happens when you have an overly expansive definition of “lie”! — TorontoLLB writes: The most straightforward counterexample I can think of is the NBC manipulation of the George Zimmerman 911 call. For example this: "The 9-1-1 operator then asked: "OK, and this guy, is he black, white or Hispanic?", and Zimmerman answered, "He looks black." was changed to: ""This guy looks like he's up to no good. He looks black." In another segment they combined completely separate parts of the call to create an audio clip that presents him as saying ""This guy looks like he's up to no good or he's on drugs or something. He's got his hand in his waistband, and he's a black male." There was other bits of reporting from the major networks that appear to be closer to fraud than selective amplification or choosing what not to report. Enough so that in Twitter threads asking people how they got "red-pilled" person after person refers to the media response to the incident. I haven’t looked into this and I can’t confirm or deny that this is true. 
I hope everyone finds at least one of these comments obviously fair, and at least another obviously unfair, in a way that encourages you to think more about these issues. 7: Other Comments — Paul writes: What's funny is the Weekly World News - the supermarket tabloid with headlines declaring Bigfoot had been found, and married to a local man's sister!; JFK was still alive, etc. - would pass muster under this analysis. They always had sources report stories to them. Those sources were just batshit crazy. Their strategy was simply not to question them skeptically to poke holes in their story as an ordinary reporter/person would, but to encourage them - "Wow, really, a wedding; what was Bigfoot wearing?" I don't mean to entirely dismiss the distinction you make. But in insisting that not a single story - not even one of the most egregious stories by the most irresponsible, disreputable, of barely-extant publications - is a lie, I think you try to prove too much. In doing so, you retreat so far that you defend only a weak and emasculated position, not any of the broader or more meaningful points implicated by your piece. Thanks for this - I always wondered what those tabloids thought they were doing, and for some reason this matches my model of human psychology better than my previous theories about “maybe they just made it up” - though I bet they do some of that too. — John Buridan writes: I used to have very low priors against conspiracy theories and so was willing to hear out the arguments at length and go back and forth for many weeks and months on a single theory. I would say my conspiracy theory expertise is in creationism and government conspiracies, especially ones involving either Catholicism or Judaism. And I'm okay on one's involving fluoridation, chemtrails, and GMOs etc. One of my housemates was a senior when I was a freshman in college gave me the Adobe illustrator birth certificate shtick, and we went through it together. 
We downloaded the birth certificate, uploaded it to Adobe illustrator, and saw the weird things. Then I went back to my day job where I was learning Adobe Illustrator. This is maybe 2 weeks later. And what do I find but that when I do this with any PDF, Illustrator renders it in the same janky way? Conspiracy dissolved. I grew up surrounded by people who believed conspiracy theories, although none of those people were my parents. And I have to say that the fact that so few people know other people who believe conspiracy theories kind of bothers me. It's like their epistemic immune system has never really been at risk of infection. If your mind hasn't been very sick at least sometimes, how can you be sure you've developed decent priors this time? Of course, this just all goes back to the dark matter beliefs of people in our outgroup. And the eternal question of where do good priors come from? How do some people's beliefs get so messed up? Thanks for this. I agree that a little bit of experience personally believing conspiracy theories, or knowing people who do, goes a long way. When I was a teenager, I flirted with a lot of pseudoarchaeology theories - think Graham Hancock, underwater pyramids, that kind of thing. I got better, but it left me with a visceral understanding of how people can genuinely believe weird things - not be lying about it, not be secretly making some kind of emotional point about how they hate the system, not be deliberately trying to be as sloppy as possible because you’re a bad person - just genuinely believe it because you tried to reason about it and failed. I think if you haven’t had that experience, then it’s really hard to understand people who have. 8: My Actual Thoughts I should probably try to say, as clearly as possible, what I think. It seems like all of these are different things: Reasoning well, and getting things right
Comments On Why 8% Of Americans Said They Had Relatives Who Died From The COVID Vaccine
So 72% of people agreed that technically true but deliberately misleading things were lies. Could I have saved myself some trouble if I had titled the post “The Media Very Rarely Says False Things”? Or “The Media Very Rarely Makes Up Facts”? I think people would have been equally annoyed that I was using “false things” or “make up facts” in a way that excludes technically-true-but-misleading statements. Someone’s going to argue I should have gone all the way and titled it “The Media Very Rarely Lies, But This Is True Only In The Most Nitpicky Technical Sense Of The Word Lie, And In The Normal Sense Of The Word They Definitely Do” - at which point I will remind you that I absolutely did that, I just put the second part in the subtitle instead of the title. If you can’t bring yourself to read the non-bolded gray text, there’s no helping you.

2: Comments Equating Lying With Egregiously Sloppy Reasoning

Other people placed a lot of importance on the specific phrasing in the Infowars birth certificate article where it concluded “therefore, the birth certificate is false”. For example, Bakkot:

I'm with you on the general point but I think you're being too charitable to InfoWars (and maybe others) in at least some examples. Take the InfoWars birth certificate one: in addition to all the claims about layers and so on, it says "the document is a shoddily contrived hoax". That is a factual claim which is false. They offer support for that claim which isn't actually convincing, and the support they offer happens to be true but out of context, and I'm with you on calling the supporting evidence "not lies". But "the document is a shoddily contrived hoax" is in fact flatly false, and is asserted by the article itself, not just "someone said".

This seems wrongheaded to me. Reposting from my own comment there: when I say "Obama's birth certificate is real and not a forgery", I'm not tapping into the Platonic realm and reading the truth directly.
I'm saying that I have seen a lot of evidence that makes me think Obama's birth certificate is real and not a forgery, and have inferred the conclusion "it's real and not a forgery" from that. If later it turned out it was a forgery - say there was some amazingly vast conspiracy theory that I completely missed - I wouldn't have been lying when I said the words "Obama's birth certificate is real and not a forgery". I would have been stating the conclusion I had inferred from my facts (which, in this hypothetical, would have been wrong, because I'm bad at reasoning). Jones states his own facts and the conclusion he infers from them. If his conclusion is wrong, the correct term for this wrongness is "failed inference", not "lying".

Bakkot is still not happy with this:

I don't think you've made a claim with reckless disregard for the truth, whereas I think InfoWars did. I am not at all convinced that InfoWars had a sincere belief that the birth certificate in question was a forgery. I think it is much more likely that they simply didn't care to know the truth of the matter. And I think it's reasonable to say that when someone makes a false claim without caring whether or not it's true, that's a lie. This is the standard used for defamation in the US ("reckless disregard for the truth" is a stock legal phrase), and defamation is usually understood to mean "lying about someone in a harmful way", so I think this is a pretty normal standard.

At this point I acknowledge we’re disputing definitions, but I want to stand by mine. I’ve seen, again and again, that people are incapable of understanding that honest disagreement is possible. For example, I wrote about this here in the context of the millions of liberals who insist that conservatives can’t possibly care about fetuses’ lives and anyone who says they do must just be lying about it in order to justify their real program of oppressing women.
When someone says “Joe is a liar”, I don’t want to have to ask every time “Do you mean you have some actual evidence for this, or that they said something you disagree with and you instantly leapt to ‘this is reckless disregard for the truth because nobody could ever be so dumb as to honestly disagree with me’?” I think if we let people use the word “lie” this way, then the overwhelming majority of accusations of lying would be false. Why would we want to define a word in a way that dooms it to constantly be used incorrectly to mislead people? I’m kind of sensitive to this because for almost every article I write, people in the comments accuse me of lying, or “pretending” I don’t know why the statements I made are wrong, or some other offense which I plead innocent to. My prior on “a randomly selected egregiously wrong person is lying” is much lower than that of the sort of people who make these accusations. I think people are just really paranoid about this, and we should use our terms carefully in a way that mitigates this paranoia rather than inflames it. Some of this might be more convincing after you read Part 6 of this post, where I list commenters’ proposed examples of media lies.

Eric Newcomer writes:

I hope we can all agree that the NYT wouldn't draw such big conclusions from such thin findings. The InfoWars birth certificate article doesn't even really seem internally certain about how PDF layers work. The critique I'm making now falls into the broader "InfoWars is much more egregious in its infractions than the NYT category." But I do think it reveals the slippery line between knowing lies and what one might call "lies of egregious sloppiness." If some serious part of a person knows that they haven't proved what they're claiming but they (or their bosses) insist on claiming that you have proved it, isn't that a form of lying?

I’m sorry, but “lies of egregious sloppiness” sounds to me like “physical violence of egregious emotional violence”.
Emotional violence and physical violence are both bad. Physical violence probably sounds worse to most people, and so it’s really tempting to, if emotional violence is really bad, say that that makes it a kind of physical violence. But I think that, although this is tempting, it’s false and you shouldn’t do it. I don’t want to say you’re allowed to sound more confident than you are. If you’re 71% confident, and you falsely say you’re 72% confident, then you are lying. But if you are very dumb, and seeing a random piece of toast makes you 100% confident that Obama’s birth certificate is false, and you vomit some random words to that effect onto a page, then you’re an idiot but not a liar. 3: Comments About Whether Infowars Believes Their Own Claims But I guess if you do want to be careful with the definition of the word “lie”, then it becomes important to know whether people at Infowars honestly believe their conspiracy theories or not. I don’t want to defend super-hard the thesis that they do. I’m not sure. If you forced me to guess, I’d say something like for a randomly given Infowars reporter and a randomly selected conspiracy theory they’re reporting on, 40% of the time they think it’s at least plausible enough that they’re doing good work by reporting on it, 20% of the time they know on some level that it’s false and they’re doing something wrong, and 40% of the time they’re in some kind of weird superposition where it seems emotionally true to them and they feel this hard enough that they never get around to asking whether it’s literally true. I’m really not attached to these numbers, but man are a lot of you attached to the claim that they definitely know their theories are false and are consciously lying. My main argument against this is that millions of people believe conspiracy theories - if they didn’t, we wouldn’t care so much about them! - and why shouldn’t some of those people work at Infowars? 
It would be quite a weird system for the conspiracy ecosystem to be run by an elite who secretly know they’re false, serving up fables to a base who believe them completely. How would you prevent some of the believers from rising into the elite? It would almost take a conspiracy of its own! Eric Newcomer has a more convincing counterargument than I expected: As an aside, I have personally worked at the NYT newsroom (reporting fellow) and at conservative outlet Washington Examiner. And I found the latter to be much sloppier and less worried about thinking through the impression it gave from facts. The Examiner would headline any big budget deficit number etc on my beat whereas the NYT had very detailed copy editors who would spot factual assertions in my copy that I didn't even consider I was making and push back on them. On InfoWars, it seems naive to presume that the outlet pushing the most misleading stories (InfoWars) is acting in good faith rather than just supplying readers with what they want. I get the point (one that Noam Chomsky has made) that outlets can just hire the bias that they want. But I actually think it's fairly hard to staff up true believers who can write and report credibly for conspiracy and super rightwing type stuff -- hence why a bunch of liberals like myself found themselves out of college writing for the local section of the Washington Examiner before it was killed. I find that on an intuitive level, I’m not too surprised to learn this - most journalists seem liberal, it would make sense that conservative papers couldn’t entirely escape this effect. On a more napkin-math level, I’m boggled - isn’t this embarrassing for the Examiner (and the journalists involved?) Wouldn’t they spend a lot of effort avoiding it? In a country with 100 million conservatives, is it really that hard to find a handful of them capable of writing news articles? There are many people writing okay-quality right-wing Substacks that get like five views per article. 
Are they doing this for the (nonexistent) money, without believing in the cause? If not, why couldn’t these people have been Washington Examiner reporters? Or InfoWarriors?

I think Richard Hanania has a theory that a lot of liberals’ political advantage comes from a culture where they are happy to work themselves ragged for minimal compensation as long as it seems like an impressive job they won’t be embarrassed to tell their friends and family about - ie intellectual college-degree-requiring labor. Maybe this is what the Examiner is taking advantage of? I don’t know.

I don’t know what kind of ethical principles Eric considered when he decided to work for the Examiner, but I bet he wouldn’t have agreed to work for Infowars even if they paid him much more money. This should also factor into our calculations about whether Infowars is being staffed by Eric-equivalents.

Human writes:

There is at least one former infowars employee who alleges that their stories are (at least often) known to be false. The most clear-cut example I can quickly find is here: "Shortly after Jones began selling the supplements, someone posted a video on YouTube holding a Geiger counter displaying high radiation readings on a beach in Half Moon Bay, Calif. The video went viral, stoking fears that radiation from Fukushima was drifting across the Pacific Ocean. Jones saw an opportunity and sent me, along with a reporter, a writer and another cameraman, to California. We had multiple Geiger counters shipped overnight, unaware of how to read or work them, and drove up the West Coast, frequently stopping to check radiation levels. Other than a small spike in Half Moon Bay — which the California Department of Public Health said was from naturally occurring radioactive materials, not Fukushima — we found nothing. "Jones was furious.
We started getting calls from the radio-show producers in the office, warning us to stop posting videos to YouTube stating we weren’t finding elevated levels of radiation. We couldn’t just stop, though; Jones demanded constant real-time content. On some of these calls, I could hear Jones screaming in the background." See also here for a discussion of Jones admitting he was lying about Sandy Hook during the lawsuit. 4: Comments On Why 8% Of Americans Said They Had Relatives Who Died From The COVID Vaccine Many people, including me, were confused by a poll in which 8% of Americans said they had a relative who died from the COVID vaccine. I speculated that maybe they were reading too quickly and misinterpreted it as “a relative who got the COVID vaccine”. But Tytonidaen wrote: I think a more likely explanation is that many people are choosing to attribute deaths to the vaccine that are not actually from the vaccine. For example, let's say Person A gets vaccinated and dies shortly after of some completely unrelated cause. And let's say Person B, the loved one being polled, has priors about vaccines or the medical establishment or whatever that cause them to be convinced it was actually the vaccine that killed Person A. In hypothetical reality, Person A lived a rather unhealthy lifestyle, had lots of risk factors for a heart attack, and would have died from a heart attack, regardless of whether they'd gotten the vaccine. Then, when Person A does, indeed, die of a heart attack, and by sheer coincidence had recently gotten vaccinated, Person B blames the COVID vaccine when polled, but it wasn't really the vaccine that killed their loved one. It might be easier to believe that outside forces (like a vaccine) harmed the person than to believe that the loved one's own actions did (like a poor lifestyle, not taking their meds, etc.). That's only one example, but I think the underlying dynamic could easily explain the poll results. 
None Of The Above wrote:

In general, it seems like when you ask a factual question with partisan/CW valence on a poll, and the respondents don't know much about the factual question, they answer the "whose side are you on" question instead. That is, if you ask Republican-voting biologists, they'll nearly all tell you the Theory of Evolution is basically how living stuff came to be, but if you ask Republican-voting normies whose vaguely-remembered high school biology class may have mentioned Darwin a few times, they'll answer that evolution is a lie--they don't really know one way or another, they're just answering the "whose side are you on" question. Democratic normies will far more often tell you evolution is true, but probably could do little better in explaining why than the Republican normies could in explaining why evolution is really an atheist lie of some kind.

Zack wrote:

I took a time-boxed peek at the Pollfish data. The 1500 results were split into 3 batches of 500. I arbitrarily selected the Jul 4 file to look at. In that file, there were 36 respondents who reported a household member had died from the vaccine. Focusing on those responses, I noticed a few interesting patterns. Of those 36 respondents, 10 responded "Yes" to both the question about death of a household member from COVID and death of a household member from the vaccine. I'm skeptical that 10 out of 500 people were unfortunate enough to have 2 household members die: one from COVID and one from the vaccine. (Especially because these are not large households; 4 of these 10 report that they have 1 other household member, and 5 of these 10 report having 2-4 other household members.) Of these 36 respondents, 20 responded "Yes" to both the question about death of a household member from the vaccine and "Are you planning on getting future COVID vaccines?" I'm skeptical that 55% of people who had a household member die of a vaccine would plan to get the vaccine themselves.
Of these 36 respondents, there are even 4 who experienced a surprising number of adverse effects from the vaccine (Myocarditis, Pericarditis, AND Bell's Palsy) requiring hospitalization in addition to having a household member die from the vaccine. Of these 4, 2 selected all of the following: "It will likely shorten my lifespan", "I am now unable to hold a job", "I am now unable to work a full day", "It impacts my personal life", "It is a minor annoyance". Those two are planning to get the vaccine again. There's some overlap between these respondents. Ignoring all of them drops from 36 who had a household member die of the vaccine to 12. I don't see obvious inconsistencies in these responses. However, there seems to be a broader issue with the survey design. They look at average time to complete each question, but average doesn't seem like the right measure here (3 people took 10+ minutes to answer; summed, the fastest 250 responses took about as long as those slowest 3). Of the 500 responses, most people seem to answer 7-10 questions. I timed myself just reading those questions silently in my head (not thinking about the answers). Of three attempts, my fastest was a bit over 17 seconds. 40 people completed the survey in 17 seconds or less. I'm skeptical it's possible for someone to provide a quality response to the survey that quickly. 225 people (nearly half) completed the survey in less than 31 seconds. I think that's the fastest I could answer if I were seeing the questions for the first time. It seems like Pollfish's model may encourage hasty, poor quality responses; "Pollfish uses non-monetary incentives like an extra life in a game or access to premium content." (https://resources.pollfish.com/pollfish-school/how-the-pollfish-methodology-works/) It seems like that creates a misalignment of incentives; the respondent is in a hurry to get back to whatever they were doing.
They provide survey fraud protection, and claim it filters suspiciously quick or suspiciously consistent answers (e.g., the same answer for all questions), but it seems to be overlooking obviously problematic responses in this case. (https://resources.pollfish.com/pollfish-school/how-pollfish-prevents-fraudulent-responses/) This bothered me enough that I emergency-edited the ACX Survey partway through to include (slightly differently phrased variants of) the two questions on the poll: Did anyone in your family (as per your best guess) die of COVID?
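Zack's point about completion times (a few very slow respondents dominate the average, while many respondents finish faster than anyone could plausibly read the questions) can be illustrated with a minimal sketch. The function name `summarize_times`, the sample data, and the 17-second cutoff as a parameter are all illustrative assumptions; none of this is the actual Pollfish data or method.

```python
from statistics import mean, median


def summarize_times(times_s, min_plausible_s):
    """Compare mean and median completion time and count implausibly fast responses.

    times_s: completion times in seconds (hypothetical data).
    min_plausible_s: shortest time in which a careful response seems possible.
    """
    fast = [t for t in times_s if t <= min_plausible_s]
    return {
        "mean": mean(times_s),
        "median": median(times_s),
        "n_fast": len(fast),
        "share_fast": len(fast) / len(times_s),
    }


# Hypothetical sample: mostly quick responses plus one very slow outlier.
times = [12, 15, 16, 20, 25, 28, 30, 35, 40, 700]
summary = summarize_times(times, min_plausible_s=17)
print(summary)
```

In this made-up sample the single 700-second response pulls the mean far above the median, which is why a median or a count of sub-threshold responses may be a more informative quality check than an average.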
Gideon (correctly) phrased this as a non-sinister albeit potentially weird misstep by the study authors, but in trying to summarize Gideon, I (incorrectly) phrased it as a sinister attempt to inflate results. After looking into it, I think Alexandros is completely right and I was completely wrong. Although I sometimes get details wrong, this one was especially disappointing because I incorrectly tarnished the reputation of Biber et al and implicitly accused them of bad scientific practices, which they were not doing. I believed I was relaying an accusation by Gideon (who I trust), but I was wrong and he was not accusing them of that. I apologize to Biber et al, my readers, and everyone else involved in this. My only reservation is that I don’t want to say too strongly that Gideon’s critique is wrong: I haven’t looked through the study documents enough to say with certainty that Alexandros’ reanalysis of the protocol issues is correct (though the superficial check I’ve done looks that way). But my mistakes are completely separate from anything Gideon did and definitely real and egregious. Cadegiani et al (Alexandros 50% right) Flavio Cadegiani did several studies on ivermectin in Brazil; I edited this section in response to criticism by Marinos and others, but the earliest version I can find on archive.is (I can’t guarantee it was the first I wrote) said: A crazy person decided to put his patients on every weird medication he could think of, and 585 subjects ended up on a combination of ivermectin, hydroxychloroquine, azithromycin, and nitazoxanide, with dutasteride and spironolactone "optionally offered" and vitamin D, vitamin C, zinc, apixaban, rivaraxoban, enoxaparin, and glucocorticoids "added according to clinical judgment". 
There was no control group, but the author helpfully designated some random patients in his area as a sort-of-control, and then synthetically generated a second control group based on “a precise estimative based on a thorough and structured review of articles indexed in PubMed and MEDLINE and statements by official government agencies and specific medical societies”. Patients in the experimental group were twice as likely to recover (p < 0.0001), had negative PCR after 14 vs. 21 days, and had 0 vs. 27 hospitalizations. Speaking of low p-values, some people did fraud-detection tests on another of Cadegiani’s COVID-19 studies and got values like p < 8.24E-11 in favor of it being fraudulent. Also in Cadegiani news: he apparently has the record for completing one of the fastest PhDs in Brazilian history (7 months), he was involved in a weird scandal where the Brazilian government tried to create a COVID recommendation app but it just recommended ivermectin to everybody regardless of what input it got, and he describes himself as: …the only author of the sole book in Overtraining Syndrome, the prevailing sport-related disease among amateur and professional athletes. He is also responsible for approximately 70% of the articles published in the field in the world in the last 05 years, and reviewer for more than 90% of the manuscripts in the field. And, uh, he’s also studied whether ultra-high-dose antiandrogens treated COVID, and found that they did, cutting mortality by 92% . Which sounds great, except that it looks like most of this is that the control group had a shockingly high mortality rate, much higher than makes sense even in the context of severe COVID. I think the charitable explanation here is that he made this data up too. But the Brazilian Parliament seems to be going with an uncharitable explanation, seeing as they have recommended that Cadegiani be charged with crimes against humanity. Anyway, let’s not base anything important on the results of this study. 
You can find Alexandros’ full critique here, but again I’ll try to summarize it as best I can. Alexandros is unhappy with my portrayal of Cadegiani’s background. I cite details that make him look strange and maybe fake, but there are other details that make him seem more impressive, like that he won gold medals at a Brazilian Scientific Olympiad.
In November 2021, I posted Ivermectin: Much More Than You Wanted To Know, where I tried to wade through the controversy on potential-COVID-drug ivermectin. Most studies of ivermectin to that point had found significant positive effects, sometimes very strong effects, but a few very big and well-regarded studies were negative, and the consensus of top academics and doctors was that it didn’t work. I wanted to figure out what was going on.
I thought the most plausible explanation for the discrepancy was Dr. Avi Bitterman’s hypothesis (now written up here) that ivermectin worked for its official indication of treating parasitic worms. COVID is frequently treated with steroids, steroids prevent the immune system from fighting a common parasitic worm called Strongyloides, and sometimes people getting treated for COVID died of Strongyloides hyperinfection. Ivermectin could prevent these deaths, which would mean fewer deaths in the treatment group than the control group, which would look like ivermectin preventing deaths from COVID in high-parasite-load areas (like the tropics) but not low-parasite-load areas (like temperate zones). This explained some of the mortality results, with the other endpoints likely being because of publication bias.
The simplest concern is that you could make chatbots write disinformation at scale. This has created a cottage industry of AI Trust And Safety people making sure their chatbot will never write arguments against COVID vaccines under any circumstances, and a secondary industry of journalists writing stories about how they overcame these safeguards and made the chatbots write arguments against COVID vaccines.
But Alex Berenson already writes arguments against COVID vaccines. He’s very good at it, much better than I expect chatbots to be for many years. Most people either haven’t read them, or have incidentally come across one or two things from his years-long corpus. The limiting factor on your exposure to arguments against COVID vaccines isn’t the existence of arguments against COVID vaccines. It’s the degree to which the combination of the media’s coverage decisions and your viewing habits causes you to see those arguments. A million mechanical Berensons churning out a million times the output wouldn’t affect that; even one Berenson already churns out more than most people ever read.
Medium Bad Scenario: Chatbots will show up in your Twitter replies and DMs, posing as friendly people trying to inform you of the dangers of COVID vaccines. If you bite, they’ll hold your hand as they walk you through anti-vaccine arguments, answering your questions and responding to your objections. Not only does this mean the disinformation will come to you (instead of you having to go to it), but it will directly target the weaknesses in your arguments and the places you’re most uncertain.
But I find myself caring less about the philosophical argument than a more emotional argument, which is - every day I see amazing people in medicine who are underappreciated. The nurses who work triple normal hours during the COVID pandemic, or doctors who go on mission trips to Africa to cure tropical diseases for zero pay. None of these people have statues; Henrietta Lacks has one in the US and another in Britain.
Prominent anti-vaccine personality Steve Kirsch offered $500k bets on the result of a public debate on whether vaccines save lives or not. Rootclaim, a hardcore Bayesian analysis organization, put significant effort into doing their homework, and accepted the bet. Then Kirsch dishonourably chickened out. A transcript of the negotiations can be found here.
UPDATE: The bet is back on! Kirsch has put in $1MM, Wilf has put in $500K, there’s room for another $500K on the pro-vaccine side (I don’t know if the debate will still go forward without it), and the terms are here.
45: New Cochrane meta-analysis finds no evidence that masks work for preventing transmission of respiratory illnesses, including COVID, but that hand-washing does.
I judge 2, 3, 4, and 6 as having happened (though 2 is confounded by COVID). 1, 5, 7, and 8 didn’t happen.
But I don’t want to take too much credit here - I was thinking of something much more obviously artificial than COVID (even if it does end up to have been a lab leak), and heavy-handed government response in the sense of cracking down on bio research. That was almost the only area in which the government’s response wasn’t heavy-handed!
Really all that this proves is that, like every rationalist, I’ve been in a constant state of mild panic about pandemic-related risks since forever. I don’t think I got any particular details of COVID right.
Finally, most of the surveys in question are just a series of basic psychology scales or tasks both the worker and average SSC reader are very familiar with. I suspect many of them are administered by students as practice rather than 'serious' research. As the other poster said, rejected HITs are just any task the requestor declines for any reason. A worker's acceptance rate is extremely important - one of the few pieces of advice Amazon seems to give requestors is to filter for 98% or 99% acceptance rate. It's probably pretty reasonable for surveys - if you can't get 99 out of 100 of those filled out acceptably (assuming good faith by the requestors), maybe you should be filtered. It's also worth noting that Amazon makes communication difficult, and that rejected HITs can only be reversed for like a month - after that, they're permanently on your record. It's also probably worth restating: if a worker goes below the high 90s, they'll have access to fewer tasks, likely from less reputable requestors, and they'll need to do 100 of these to offset every rejection. And the worker is at much greater risk of being dug deeper into that hole by requestors rejecting their work in bad faith with no recourse - part of why surveys are popular is because the IRB can bludgeon requestors into accountability. Most of the surveys in question are also the crumbs that filter through the grasping pedipalps of the hordes of workers (and their scripts). If people are seriously using MTurk to monetize their time, they're likely looking for 'batch HITs' - the sort of thing where there's hundreds or thousands of tasks that can be quickly repeated (moderating images, 3 cents for a sentiment analysis, a couple quarters to outline a car in an image, etc.) Of course, this manna from heaven rarely lasts long, and the worker always takes a risk - 'if I do 100 of these, and this is an unscrupulous requestor, well - I better have ten thousand accepted HITs under my belt.' 
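The acceptance-rate arithmetic in this comment (roughly 100 tasks to offset each rejection at a 99% filter, and about ten thousand accepted HITs to absorb 100 bad-faith rejections) can be checked with a short sketch. It assumes the rate is simply accepted / (accepted + rejected); the function name `accepted_needed` is hypothetical and this is not Amazon's documented formula.

```python
def accepted_needed(rejected: int, threshold_pct: int) -> int:
    """Minimum accepted HITs so that accepted / (accepted + rejected) >= threshold_pct / 100.

    Assumes a plain ratio of accepted to total submitted HITs (an assumption,
    not a documented MTurk rule).
    """
    if not 0 < threshold_pct < 100:
        raise ValueError("threshold_pct must be strictly between 0 and 100")
    # Solving a / (a + r) >= p / 100 for a gives a >= p * r / (100 - p).
    # Integer ceiling division avoids floating-point rounding.
    numerator = threshold_pct * rejected
    denominator = 100 - threshold_pct
    return -(-numerator // denominator)


one_rejection = accepted_needed(1, 99)      # accepted HITs needed per rejection at 99%
bad_batch = accepted_needed(100, 99)        # accepted HITs needed to absorb 100 rejections
looser_filter = accepted_needed(1, 98)      # the same calculation at a 98% filter
print(one_rejection, bad_batch, looser_filter)
```

Under this assumption a single rejection at the 99% filter needs 99 accepted HITs to balance it, and a batch of 100 rejections needs 9,900, which is consistent with the commenter's "ten thousand" rule of thumb.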
That's why workers are so protective of their acceptance rate. Back to surveys - again as the other poster replied, most of what the average MTurk worker will see is probably a psychology study questionnaire with a series of whatever common scales, attention checks, and other tricks the worker has probably seen at least dozens if not hundreds of times by now. They often pay Amazon's princely sum of about 10 cents per (expected) minute - based on the minimum wage in whatever benighted 00s year Amazon Mechanical Turk launched. Anecdotally, it also seems like a lot of these are from students - probably just practice research by someone who likely has less experience with the platform than the worker themselves. The problem the requestor has - at least as of ~2018 - is that there is a lot of fraud with foreign workers getting access to MTurk accounts and submitting totally garbo data, often very quickly. Based purely on a 'time to complete' metric, this is hard to distinguish from a legit worker who has filled out hundreds of these and is looking to maximize how many pennies they get for their minutes. It also wasn't uncommon for workers to 'cook' such a survey - letting it sit at the end screen before submitting - just to avoid getting pinged for finishing it quickly. As for how this all ties back into Institutional Review Boards - well, yeah, griping to the IRB is often the MTurk worker's only recourse. Amazon just doesn't care, and as I recall a lot of requestors don't even know workers can contact them - and as mentioned there's a narrow time window to discuss rejected HITs before they become permanent. On the other hand, in a lot of cases this is basically a reddit mob complaining that a student doling out dimes screwed up their understanding of MTurk's arcane inner workings, and that's in the case that the workers aren't actually trying to defraud them for said dimes. 5. 
Comments About Regulation, Liability, and Vetocracy CatCube writes: I think the fundamental problem is that you cannot separate the ability to make a decision from the ability to make a *wrong* decision. However, our society--pushed by the regulator/lawyer/journalist/administrator axis you discuss--tries to use detailed written rules to prevent wrong decisions from being made. But, because of the decision/wrong decision inseparability thing, the consequences are that nobody has the ability to make a decision. This is ultimately a political question. It's not wrong, precisely, or right either. It's a question of value tradeoffs. Any constraint you put on a course of action is necessarily something that you value more than the action, but this isn't something people like to admit or hear voiced aloud. If you say, "We want to make sure that no infrastructure project will drive a species to extinction", then you are saying that's more important than building infrastructure. Which can be a defensible decision! But if you keep adding stuff--we need to make sure we're not burdening certain races, we need to make sure we're getting input from each neighborhood nearby, etc.--you can eventually end up overconstraining the problem, where there turns out to be no viable path forward for a project. This is often a consequence of the detailed rules to prevent wrong decisions. But because we can't admit that we're valuing things more than building stuff (or doing medical research, I guess?), we as a society just end up sitting and stewing about how we seemingly can't do anything anymore. We need to either: 1) admit we're fine with crumbling infrastructure, so long as we don't have any environmental, social, etc., impacts; or 2) decide which of those are less important and streamline the rules, admitting that sometimes the people who are thus able to make a decision are going to screw it up and do stuff we ultimately won't like. 
Darwin on why safetyism expanded just as the neoliberals were trying to decrease government regulation: Without the excuse of 'we were following all of the very strict and explicit regulations, so the bad thing that happened was a freak accident and not our fault' to rely on, companies had to take safety and caution and liability limitation and PR management into their own hands in a much more serious way. And without the confidence in very strict and explicit regulations to limit the bad things companies might do, and without democratically-elected regulators as a means to bring complaint and effect change, we became much more focused on seeking remedy for corporate malfeasance by suing companies into oblivion and destroying them in the court of public opinion. Basically, government actually *can* do useful things, as it turns out. One of the useful things it can do is be a third party to a dispute between two people or entities, such as 'corporations' and 'citizens', and use its power to legibly and credibly ensure cooperation by explicitly specifying what will be considered defection and then punishing it harshly. This actually allows the two parties, which might otherwise be in conflict, to trust each other much more and cooperate much better, because their incentives have been shifted by a third party to make defection more costly. Without government playing that role, you can fall back into a bad equilibrium of distrust and warring, which in this case might look like a wary populace ready to sue and decry at the slightest excuse, and paranoid corporations going overboard on caution and PR to shield from that. Meadow Freckle writes: Why can’t you sue an IRB for killing people for blocking research? You can clearly at least sometimes activist them into changing course. But their behavior seems sue-worthy in these examples, and completely irresponsible. We have negligence laws in other areas. 
Is there an airtight legal case that they’re beyond suing, or is it just that nobody’s tried? I don’t know, and this seems like an important question. And Donald writes: Why do we need special rules for medicine? The law has rules about what dangerous activities people are allowed to consent to, for example in the context of dangerous sports or dangerous jobs. Criminal and civil trials in this context seem to be a fairly functional system. If Doctors do bad things, they can stand in the accused box in court and get charged with assault or murder, with the same standards applied as are applied to everyone else. If there need to be exceptions, they should be exceptions of the form "doctors have special permission to do X". I do want to slightly defend something IRB-like here. When a doctor asks you to be part of a study, they’re implicitly promising that they did their homework, this is a valuable thing to study, and that there’s no obvious reason it should be extremely unsafe. As a patient (who may be uneducated) you have no way of knowing whether or not this promise is true. Every so often, someone does everything right, and something goes wrong anyway. A drug that everyone reasonably thought would be safe and effective turns out to have unpredictable side effects - this is part of why we have to do studies in the first place. If every time this happened, a doctor had to stand trial for assault/murder, nobody would ever study new drugs. Trials are a crapshoot, and juries tend to rule against doctors on the grounds that the disabled/dead patient is very sympathetic and everyone knows doctors/hospitals are rich and can give them infinite money as damages. There is no way for an average uneducated jury to distinguish between “doctor did their homework and got unlucky” and “doctor did an idiotic thing”. Either way, the prosecution can find “expert witnesses” to testify, for money, that you were an idiot and should have known the study would fail. 
In order to remove this risk, you need some standards for when a study is safe, so that if people sue you, you can say “I was following the standards and everyone else agreed with me that this was good” and then the lawsuit will fail. Right now those standards are “complied with an IRB”. This book is arguing that the IRB’s standards are too high, but we can’t cut the IRB out entirely without some kind of profound reform of the very concept of lawsuits, and I don’t know what that reform would look like. 6. Comments About The Act/Omission Distinction jumpingjacksplash writes: I think you've unintentionally elided two distinct points: first, that IRBs are wildly inefficient and often pointless within the prevailing legal-moral normative system (PLMNS); second, that IRBs are at odds with utilitarianism. Law in Anglo-Saxon countries, and most people's opinions, draw a huge distinction between harming someone and not helping them. If I cut you with a knife causing a small amount of blood loss and maybe a small scar, that's a serious crime because I have an obligation not to harm you. If I see a car hurtling towards you that you've got time to escape from if you notice it, but don't shout to warn you (even if I do this because I don't like you), then that's completely fine because I have no obligation to help you. This is the answer you'd get from both Christianity and Liberalism (in the old-fashioned/European sense of the term, cf. American Right-Libertarianism). Notably, in most Anglo-Saxon legal systems, you can't consent to be caused physical injury. Under PLMNS, researchers should always ask people if they consent to using their personal data in studies which are purely comparing data and don't change how someone will be treated. For anything that affects what medical treatment someone will or won't receive, you'd at least have to give them a full account of how their treatment would be different and what the risks of that are. 
If there's a real risk of killing someone, or permanently disabling them, you probably shouldn't be allowed to do the study even if all the participants give their informed consent. This isn't quite Hans Jonas' position, but it cashes out pretty similarly. That isn't to say the current IRB system works fine for PLMNS purposes; obviously there's a focus on matters that are simply irrelevant to anything anyone could be rationally concerned with. But if, for example, they were putting people on a different ventilator setting than they otherwise would, and that risked killing the patient, then that probably shouldn't be allowed; the fact that it might lead to the future survival of other, unconnected people isn't a relevant consideration, and nor is "the same number of people end up on each ventilator setting, who cares which ones it is" because under PLMNS individuals aren't fungible. Under utilitarianism, you'd probably still want some sort of oversight to eliminate pointless yet harmful experiments or reduce unnecessary harm, but it's not clear why subjects' consent would ever be a relevant concern; you might not want to tell them about the worst risks of a study, as this would upset them. The threshold would be really low, because any advance in medical science could potentially last for centuries and save vastly more people than the study would ever involve. The problem is, as is always the case for utilitarianism, this binds you to some pretty nasty stuff; I can't work out whether the Tuskegee experiment's findings have saved any lives, but Mengele's research has definitely saved more people than he killed, and I'd be surprised if that didn't apply to Unit 731 as well. The utilitarian IRB would presumably sign off on those. 
More interestingly, it might have to object to a study where everyone gives informed consent but the risk of serious harm to subjects is pretty high, and insist that it be done on people whose quality of life will be less affected if it goes wrong (or whose lower expected utility in the longer term makes their deaths less bad) such as prisoners or the disabled. The starting point to any ideal system has to be setting out what it's trying to achieve. Granted, if you wanted reform in the utilitarian direction, you probably wouldn't advocate a fully utilitarian system due to the tendency of the general public to recoil in horror. I want to stress how far we are away from “do experiments without patient’s consent” here - a much more common problem is that patients really want to be in experiments, and the system won’t allow it. This is most classic in studies on cancer, where patients really want access to experimental drugs and IRBs are constantly coming up with reasons not to give it to them. Jonas argued that all cancer studies should be banned because it’s impossible to consent when you’re desperate to survive, which isn’t the direction I would have taken that particular example in. But there are other examples - during COVID, lots of effective altruists stepped up to be in human challenge trials that would have gotten the vaccines tested faster, but the government wouldn’t allow them to participate. I would honestly be happy with a system that counts the harm of denying a patient’s ability to consent to an experiment they really want to be in as a negative, forget about any lives saved. And JDK writes: I haven't finished reading but felt compelled to comment on this: "the stricter IRB system in place since the '90s probably only prevents a single-digit number of deaths per decade, but causes tens of thousands more by preventing lifesaving studies." No. It does NOT "cause" deaths. We can't go down this weird path of imprecision about what "causing" means. 
I've been examining Ivan Illich, "Medical Nemesis" recently. By claiming IRBs which stop research ostensibly CAUSE death strikes me as cultural iatrogenesis masquerading as a cure for clinical iatrogenesis. […] "Might have been saved if" is not the same as "death was caused by". This seems to me to be a weird and overly metaphysical nitpick. Suppose a surgeon is operating on someone. In the process, they must clamp a blood vessel - this is completely safe for one minute, but if they leave it clamped more than one minute, the patient dies. They clamp it as usual, but I rush into the operating room and forcibly restrain the surgeon and all the staff. The surgeon is unable to remove the clamp and the patient dies. I (and probably the legal system) would like to be able to say I caused the patient’s death in this scenario. But it sounds like JDK is saying I have to say the surgeon caused the patient's death and I was only tangentially involved. Here’s another example; suppose the US government bans all food production - farmers, hunters, fishermen, etc are forbidden from doing their jobs. After a few months, everyone starves to death. I might want to say something like “the US government’s ban on food production killed people”. But by JDK’s reasoning, this is wrong - the government merely prevented farmers and fishermen from saving people (by giving them food so they didn’t starve). I might want to say something like “Mao’s collective farming policy killed lots of people”. But since this is just a weaker version of hypothetical-Biden’s ban on food, by JDK’s reasoning I can’t do this. This seems contrary to common usage, common sense, and communicating information clearly. I have never heard any philosopher or dictionary suggest this, so what exactly is the argument? (JDK has a response here, but I didn’t find it especially enlightening) 7. 
Comments About The Applications For AI Metaphysiocrat writes: People have joked about applying NEPA review to AI capabilities research, but I wonder if some kind of IRB model might have legs (as part of a larger package of capabilities-slowing policy.) It’s embedded in research bureaucracies, we sort of know how to subject institutions to it, and so on. I can think of seven obvious reasons this wouldn’t work, but at this point I’m getting doomery enough that I feel like we may just have to throw every snowball we have at the train on the off chance one has stopping power. Zach Stein-Perlman writes: A colleague of mine is interested in 'IRBs for AI'-- he hasn't investigated it but has thought about IRB-y stuff in the context of takeaways for AI (https://wiki.aiimpacts.org/doku.php?id=responses_to_ai:technological_inevitability:incentivized_technologies_not_pursued:vaccine_challenge_trials). He's interested in people's takes on the topic. My take: my understanding is that the US can’t technically demand all doctors use IRBs. (Almost) all doctors use IRBs for a combination of a few reasons: The US government demands that everyone who receives federal funding use an IRB, and most doctors get some federal funding.
Political liberalism vs. concern about COVID: 0.33
Source: World Bank. Britain is the thick blue line. …and it also shows UK growth being about average. So what’s going on? I asked about this in an Open Thread. Here were some of your responses. Eric Rall writes: There are two different ways of calculating real GDP per capita in an international context, both of which involve converting local currency to dollars and then inflation-adjusting the dollars based on the US's GDP deflator. One uses market exchange rates, while the other uses "Purchasing Power Parity", attempting to optimize the GDP figure as a proxy for standard-of-living by using local prices for equivalent goods and services as the currency conversion factor. For Brexit-related and COVID-related reasons, the relationship between PPP and market exchange rates for Britain have been highly unstable in the period in question: exchange rates have been very volatile (ranging from US$1.08 to US$1.40 per £1.00), and tariffs and COVID disruption have both radically changed the availability and prices of imported goods. Looking at either the PPP or market exchange rate numbers, everyone took a big hit in 2020, while Britain appears to have taken a deeper hit than France and the overall OECD average (the two control groups I picked off the top of my head). The big difference is that in market exchange rate terms, the recovery looks proportionate to the decline (i.e. Britain fell more, but also recovered proportionately faster so as to bounce back to approximately 2019 levels in 2022 the same as France and OECD): (source) But in PPP terms, the UK has recovered at the same rate as France and OECD and thus appears to have permanently (so far) lost ground in standard of living relative to other countries. UK was also growing more slowly in PPP terms between 2015 and 2019 than France, but about the same as the OECD average: (source) Putting some numbers on the second graph: Just before COVID, Britain had 106% the average OECD GDP
At the peak of COVID, Britain had 101% the average OECD GDP
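Eric Rall's distinction between market-exchange-rate and PPP conversions can be made concrete with a minimal sketch. The exchange-rate range (US$1.08 to US$1.40 per £1.00) comes from the comment above; the per-capita GDP figure, the PPP factor, and the function name `to_usd` are purely illustrative assumptions, not World Bank or OECD data.

```python
def to_usd(local_gdp_per_capita: float, usd_per_gbp: float) -> float:
    """Convert a per-capita GDP figure in pounds to dollars at a given conversion factor."""
    return local_gdp_per_capita * usd_per_gbp


# Hypothetical per-capita GDP in pounds; not a real statistic.
gdp_gbp = 30_000.0

# Market exchange rates at the extremes quoted in the comment.
market_low = to_usd(gdp_gbp, 1.08)
market_high = to_usd(gdp_gbp, 1.40)

# Hypothetical PPP factor, expressed here as dollars of purchasing power per pound.
ppp_factor = 1.45
ppp_value = to_usd(gdp_gbp, ppp_factor)

# How much the market-rate figure moves purely because of currency volatility.
swing = market_high / market_low
print(f"market: {market_low:,.0f} to {market_high:,.0f} USD ({swing:.2f}x swing)")
print(f"PPP:    {ppp_value:,.0f} USD")
```

With an unchanged output in pounds, the market-rate figure moves by roughly 30% across the quoted exchange-rate range, while the PPP figure changes only when local prices do. That is one way the two series could diverge as sharply as the comment describes.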
That's why people don't donate kidneys unless it's to their family. It's clearly risky. A bunch of discredited health people saying it's not risky isn't gonna change that - COVID showed clearly that they are the sort of people who will lie at the drop of a hat if they think it'll make people behave in ways that are somehow more "pro social" regardless of actual risk.
10: Alex Tabarrok: Don’t Let The FDA Regulate Lab Tests. The FDA does not usually regulate lab tests, but they took over this responsibility during the pandemic, then proceeded to bungle it so badly that American hospitals and public health departments spent months without working COVID tests long after other countries had made them cheap and easily available. Now they’re trying to take permanent control of the whole area. Don’t let them!
How many residents will live in Prospera, a new special economic zone in Honduras, on Jan 1, 2026? Answer: 600 (80% confidence interval 100-2,000) This seems like a good guess (except that my confidence interval would have included zero because there’s a 20%+ chance that it gets shut down). So overall its forecasts seem pretty impressive. But I was concerned by its reasoning even in some of the questions it got “right”. For example, the Nikki Haley question tried to get a base rate by asking what percent of elections Haley had won before, and found she had won 71% of them - these were mostly elections for South Carolina governor. You can see what the AI is trying to do - but it’s not going to work. Then it got confused and read a lot of news stories about how she’s currently losing the 2024 presidential election, and seemed to think they were about 2028. So either the AI only got a reasonable probability by coincidence, or it was testing many different strategies, throwing out the useless ones, and updating only on the useful ones, in a way that was kind of opaque to the casual reader. Still, if the company says it beats most human forecasters, this doesn’t seem totally impossible based on what I’ve seen. And that would be exciting! An AI that can generate probabilistic forecasts for any question seems like in some way a culmination of the rationalist project. And if you can make something like this work, it doesn’t sound too outlandish that you could apply the same AI to conditional forecasts, or to questions about the past and present (eg whether COVID was a lab leak). I would be most excited if at some point this graduated from its geopolitical focus and was able to answer questions on any topic (eg “what is the chance that Astral Codex Ten gains paid subscribers this year?”), maybe if the questioner gives it links or feeds it some of the appropriate information. 
FutureSearch is run by a team formerly from Metaculus, including former Metaculus CTO (and Google internal prediction market veteran) Dan Schwarz. They’re looking for potential clients and/or investors; if you’re interested, email hello@futuresearch.ai. Vitalik On AI Prediction Markets Vitalik Buterin, Ethereum-founder-turned-cryptocurrency-public-intellectual, has a blog post on The Promise And Challenge Of Crypto + AI Applications. One of them is a prediction market. He writes: Prediction markets have been a holy grail of epistemics technology for a long time; I was excited about using prediction markets as an input for governance ("futarchy") back in 2014, and played around with them extensively in the last election as well as more recently. But so far prediction markets have not taken off too much in practice, and there is a series of commonly given reasons why: the largest participants are often irrational, people with the right knowledge are not willing to take the time and bet unless a lot of money is involved, markets are often thin, etc. One response to this is to point to ongoing UX improvements in Polymarket or other new prediction markets, and hope that they will succeed where previous iterations have failed. After all, the story goes, people are willing to bet tens of billions on sports, so why wouldn't people throw in enough money betting on US elections or LK99 that it starts to make sense for the serious players to start coming in? But this argument must contend with the fact that, well, previous iterations have failed to get to this level of scale (at least compared to their proponents' dreams), and so it seems like you need something new to make prediction markets succeed. And so a different response is to point to one specific feature of prediction market ecosystems that we can expect to see in the 2020s that we did not see in the 2010s: the possibility of ubiquitous participation by AIs. 
AIs are willing to work for less than $1 per hour, and have the knowledge of an encyclopedia - and if that's not enough, they can even be integrated with real-time web search capability. If you make a market, and put up a liquidity subsidy of $50, humans will not care enough to bid, but thousands of AIs will easily swarm all over the question and make the best guess they can. The incentive to do a good job on any one question may be tiny, but the incentive to make an AI that makes good predictions in general may be in the millions. Note that potentially, you don't even need the humans to adjudicate most questions: you can use a multi-round dispute system similar to Augur or Kleros, where AIs would also be the ones participating in earlier rounds. Humans would only need to respond in those few cases where a series of escalations have taken place and large amounts of money have been committed by both sides. This is a powerful primitive, because once a "prediction market" can be made to work on such a microscopic scale, you can reuse the "prediction market" primitive for many other kinds of questions: Is this social media post acceptable under [terms of use]?
Spinning a narrative that plays fast and loose with the truth, in order to avoid “panic” or empowering “the wrong people” - for example, trying to play down concerns about COVID because that might incite mobs to attack Chinese people. This violates the usual moral rule against deception, to serve the supposed greater good of preventing the panic.
Source: NPR. To be fair, we have only the scientist’s word that this is why he had the picture. But he definitely did have it. People say it would be a surprising coincidence if a zoonotic coronavirus pandemic just so happened to start in a city with a big coronavirus research lab, and this is true. But it would be an even more surprising coincidence if a lab-leak coronavirus pandemic just so happened to first get detected at a raccoon-dog stall in a wet market! Saar: It’s not clear that the first case was at the wet market; a certain Mr. Chen, with no connection to the market, seems to have fallen sick on December 8. An SCMP article suggested there were 92 previously-undetected cases suspicious for COVID as far back as November. And even if half of the first forty universally-agreed-upon cases had market connections that means another half didn’t. There was a bias towards detecting cases at the market: because authorities thought the market was the origin, and because everyone was thinking about zoonosis after SARS1, they only screened/diagnosed people with a market connection. One of the few non-market-connected COVID cases detected during this period was only detected because he was the relative of a hospital worker; the worker noticed the signs and insisted they go to the hospital despite the lack of a wet market connection. Although the map of positive samples and cases at the market was centered near the raccoon-dog stall, that could be because that area was sampled more; it’s also close to the mahjong room, where visitors and vendors at the market would go and unwind in a tight, poorly ventilated area. The next session will focus more on the WIV, but the short version is that they were doing lots of gain of function research. So one story compatible with the evidence is that a worker at WIV got infected with their modified coronavirus and passed it to his contacts. 
COVID started spreading quietly a few weeks to months before the first market-related case was detected. This accounts for the 92 earlier cases, Mr. Chen’s case, and the half of officially-detected cases with no wet market association. Then an infected person went to the market, causing a super-spreader event. Some of the infected market patrons went to the hospital, where doctors traced it back to the market and told other doctors to be on the lookout for wet market patrons coming in with weird viral pneumonias. They found some, declared victory, and the few anomalies - like the hospital worker’s relative - were forgotten, or assumed to have wet market connections that nobody could find. China quashed all evidence of the lab research (as was done in previous lab leak cases, eg the USSR) so all we have is the apparent wet market links that Peter found so convincing. Peter: The supposed pre-wet-market cases are confirmed fakes. Yes, the WHO did an investigation of whether there might have been COVID cases circulating before the wet market, and identified 92 unusual pneumonias that merited further review. But their final investigation, which included testing samples from these people after good tests became available, found that none of these people really had COVID. As for Mr. Chen, he said in an interview that he was hospitalized for dental issues on December 8, caught COVID in the hospital on December 16, and then was erroneously reported as “hospitalized for COVID on December 8”. The December 16 date is after the first wet market cases. Further, it seems epidemiologically impossible for COVID to have been circulating much before the first cases were officially detected December 11. The COVID pandemic doubles every 3.5 days. So if the first infection was much earlier - let’s say November 11 - we would expect 256x as much COVID as we actually saw. 
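The doubling arithmetic in Peter's argument can be checked with a short sketch. It assumes unchecked exponential growth at the 3.5-day doubling time quoted above; the function names are illustrative, and the year in the dates is the known 2019 one (only the day count matters).

```python
import math
from datetime import date

def growth_factor(days: float, doubling_time_days: float = 3.5) -> float:
    """Multiplicative growth after `days` of steady exponential doubling."""
    return 2 ** (days / doubling_time_days)

def days_for_factor(factor: float, doubling_time_days: float = 3.5) -> float:
    """Days of steady doubling needed to grow by `factor`."""
    return doubling_time_days * math.log2(factor)

# Hypothetical earlier start (November 11) versus the first detected cases (December 11).
extra_days = (date(2019, 12, 11) - date(2019, 11, 11)).days

print(extra_days)                 # days of extra head start
print(growth_factor(extra_days))  # how many times more cases that head start implies
print(days_for_factor(256))       # days of head start that give exactly 256x
```

Under these assumptions, 256x corresponds to exactly eight doublings (28 days); a full 30 days would give somewhat more, roughly 380x. Either way the point is the same: a month of extra spread implies hundreds of times more cases than were observed.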
Even if the first couple of cases were missed because nobody was looking for them, the number of hospitalizations, deaths, etc, in January or whenever were all consistent with the number of people you’d expect if the pandemic started in early December - and not consistent with 256x that many people. So probably we should just accept that the first reported case - a wet market vendor, December 11 - was very early in the pandemic. She wasn’t literally the first case - that would most likely have been someone who worked at the raccoon-dog shop, whose case might (like 95% of COVID cases) have been mild enough not to come to medical attention. But she was certainly very early. Although authorities eventually decided COVID spread through a wet market and started deliberately looking for wet market connections, this only happened on December 30. So the earliest cases - including the 40 very earliest cases where half came from the wet market - weren’t biased (at least not through that particular route). So the claim that “the first case, and half of the first 40 cases, had wet market connections” stands as real and convincing evidence. Although the exact center of the map of positive COVID samples in the wet market was the mahjong room, the samples taken from the mahjong room were not, themselves, positive (cf: although a low-resolution population density map of New York might show Central Park in the exact center of the population density gradient, Central Park does not itself have population). There was no real “super-spreader event” at the wet market. There was a slow burn - one case the first day, a few more the next day, a few more the day after that. It’s hard to see how a single visit from an infected lab worker could do that. 
So the only way it could possibly be a lab leak is if the lab leaked sometime in late November, infected exactly one lab worker, that worker went straight to the wet market, infected a vendor, then went home, quarantined, recovered, and all other cases were downstream of that first infected wet market vendor. This is unparsimonious. Saar: The only source saying that Mr. Chen got sick early was an anonymous interview. And even if he was later than the first wet market cases, nobody was able to find any wet market connections. This means that whoever infected him was earlier than the index case and not linked to the wet market. Peter argued that COVID couldn’t have been more than a few weeks old when the first wet market cases were detected. But this was based on its known doubling rate. If pre-discovery COVID had a slower doubling time than known COVID, it could have been around longer. And post-lockdown serology suggested numbers that were larger than claimed at the time. So contra Peter’s claims, the infection could have been going on longer, which wouldn’t require the first lab worker to go straight to the market. It could have been weeks. Dr. Jesse Bloom’s investigation of the wet market samples, considered the final and most conclusive, failed to find a clear connection between COVID and raccoon-dogs or any other animals. Although the concentration of positive samples seemed highest near the raccoon dog stall, if you do a formal statistical analysis of which animals’ DNA was found near COVID samples most often, raccoon dogs are near the bottom. The top is wide-mouth bass, which can’t get COVID. This is obviously contamination, probably from infected humans touching wide-mouth bass tanks or something. 
Although the Chinese data included a negative sample from a mahjong table, it included a mention of poultry being sold nearby, which might mean this wasn’t the mahjong room itself, but some other mahjong table at a poultry shop elsewhere in the market, and (dry) mahjong tables might not hold the virus well anyway. Peter: Raccoon-dogs were sold in various cages at various stalls, separated by air gaps big enough to present a challenge for COVID transmission, and there’s no reason to think that one raccoon-dog would automatically pass it to all the others. The statistical analysis just proves there were many raccoon-dogs who didn’t have COVID. But you only need one. The raccoon dog shop and the drain leading out of the raccoon dog shop had some of the highest positive sample rates, which is more interesting than a statistical analysis which everyone agrees must be wrong (since it favors bass). It’s unclear why the negative mahjong sample says something about poultry, but based on the stated location, it’s definitely the one in the mahjong room. Session 1.5: Lineages This was technically part of Session 2, but formed enough of a discrete topic that I found it confusing to intermix it with all the other viral genetics points. I’m spinning it out into a separate summary, but the videos are all in the next session. Yuri: The coronavirus eventually mutated into many different strains. But the first big split, seen in some of the earliest samples, is between two different sub-strains called Lineage A and Lineage B, which differ by two mutations. In these two mutations, Lineage A is the same as BANAL-52, a bat virus which is the closest-known relative of COVID, but Lineage B is different. Since COVID probably evolved from something like BANAL-52, Lineage A must have come first, spread for a while, and then gotten two new mutations, turning it into Lineage B. All of the cases at the wet market, including the first detected case, were Lineage B. 
Lineage A wasn’t discovered until about a week later, and none of the Lineage A patients had been to the wet market. Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods. This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there. Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it has always had twice as many cases. Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first. Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans. Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. 
It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain. Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans. Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market! Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market. Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market. There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market. 
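Peter's step from "a consistent 2/3 B versus 1/3 A" to "B started 3-4 days earlier" can be written out as a small sketch. It assumes both lineages grow exponentially at the same rate, so that a constant case ratio reflects only a head start; the function name is illustrative rather than anything from the debate.

```python
import math

def head_start_days(share_b: float, share_a: float, doubling_time_days: float) -> float:
    """Head start (in days) that yields a constant share_b:share_a case ratio
    when both lineages double at the same rate."""
    return doubling_time_days * math.log2(share_b / share_a)

# A 2:1 ratio is one doubling, so the head start equals the doubling time.
for doubling_time in (3.0, 3.5, 4.0):
    print(doubling_time, head_start_days(2 / 3, 1 / 3, doubling_time))
```

This only restates the arithmetic. It does not address the rebuttal that a super-spreader event, rather than a head start, could have produced the same ratio.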
Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. Stop calling the wet market a super-spreader event. The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug. The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. 
Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations. Session 2: Viral Genetics Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened. (quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.) We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. 
There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the amino acid arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codon CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end. That suggests it was already adapted to humans. 
This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. Even if WIV did try to create COVID, they couldn’t have. 
As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither. Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work. Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did. COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature. 
For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12 - 15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains. Sometimes insertions happen because of a mistake in viral replication. Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG. COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form. 
Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. But it’s exactly the kind of weird thing that happens in the random chance of evolution. COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them. The claim that COVID is uniquely adapted to humans is false. The paper making that claim defined how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals. It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species. In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience. 
They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness. Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market. Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this. The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses. The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural. The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up. Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015. 
All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips. Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky. Session 3: Closing Arguments This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here. Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus. 
Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. 
It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate. Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite point from the one you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate.

Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor. And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one.

Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper.
It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”. The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started.

Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine, but you can see Saar’s version here, and Peter’s starting at 45:33 of this video. Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate.
Rootclaim’s real final probability is 94% lab leak. You can see their analysis here.

And The Winner Is . . . … … … … …

Peter and the zoonosis hypothesis. This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. You can see the judges’ own summary of their reasoning here (Will, Eric).

Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing.

Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting. And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey).

For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis.

III. The Math And The Aftermath

The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length.

The Math: Judges

Both judges included a probabilistic analysis in their written decision.
Here’s the same table as above, expanded to add the judges: I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then includes the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. 
I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns. The Math: Extreme Odds Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example: A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. 
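Saar's DNA example can be put into a few lines of arithmetic. This is only a minimal sketch: the function name is mine, and the 1-in-1,000 chance of "some other explanation" (lab mistake, framing, and so on) is a made-up illustrative number, not something either side claimed. It shows how even a modest chance of an alternative explanation swamps a one-in-a-billion random-match probability.

```python
def effective_bayes_factor(p_evidence_given_h, p_match_by_chance, p_other_explanations):
    """Bayes factor for H vs not-H when, under not-H, the evidence can arise
    either by pure chance or through some other route (lab mistake, framing...).

    All three inputs are probabilities; p_other_explanations is an assumption
    the analyst has to supply, it does not come from the DNA statistics.
    """
    # Under not-H: either the random match happens, or it doesn't and
    # one of the other explanations produces the evidence anyway.
    p_evidence_given_not_h = (
        p_match_by_chance + (1 - p_match_by_chance) * p_other_explanations
    )
    return p_evidence_given_h / p_evidence_given_not_h


# Taking the 1E-9 random-match figure at face value gives a factor near 1E9.
naive = effective_bayes_factor(1.0, 1e-9, 0.0)

# Allowing an assumed 1-in-1,000 chance of some other explanation
# drops the factor to just under 1,000.
hedged = effective_bayes_factor(1.0, 1e-9, 1e-3)

print(f"naive factor:  {naive:.3g}")
print(f"hedged factor: {hedged:.3g}")
```

The same structure applies to the wet market number: the 1/10,000 only survives if the chance that the argument behind it is flawed is itself well below 1/10,000.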
So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
The southwest corner is where most of the wildlife was being sold. Rumor said that included a stall with raccoon-dogs, an animal which is generally teeming with weird coronaviruses, and is a plausible intermediate host between humans and bats: Awwww, come on, you can’t stay mad at this little guy. China said this rumor was false and refused to release any information. Scientists were finally able to confirm the existence of the raccoon-dog shop in the funniest possible way: a virologist had visited Wuhan in 2014, saw the awful conditions in the shop, and took a picture as an example of the kind of place that a future pandemic might start. Source: NPR. To be fair, we have only the scientist’s word that this is why he had the picture. But he definitely did have it. People say it would be a surprising coincidence if a zoonotic coronavirus pandemic just so happened to start in a city with a big coronavirus research lab, and this is true. But it would be an even more surprising coincidence if a lab-leak coronavirus pandemic just so happened to first get detected at a raccoon-dog stall in a wet market! Saar: It’s not clear that the first case was at the wet market; a certain Mr. Chen, with no connection to the market, seems to have fallen sick on December 8. An SCMP article suggested there were 92 previously-undetected cases suspicious for COVID as far back as November. And even if half of the first forty universally-agreed-upon cases had market connections that means another half didn’t. There was a bias towards detecting cases at the market: because authorities thought the market was the origin, and because everyone was thinking about zoonosis after SARS1, they only screened/diagnosed people with a market connection. One of the few non-market-connected COVID cases detected during this period was only detected because he was the relative of a hospital worker; the worker noticed the signs and insisted they go to the hospital despite the lack of a wet market connection. 
Although the map of positive samples and cases at the market was centered near the raccoon-dog stall, that could be because that area was sampled more; it’s also close to the mahjong room, where visitors and vendors at the market would go and unwind in a tight, poorly ventilated area. The next session will focus more on the WIV, but the short version is that they were doing lots of gain of function research. So one story compatible with the evidence is that a worker at WIV got infected with their modified coronavirus and passed it to his contacts. COVID started spreading quietly a few weeks to months before the first market-related case was detected. This accounts for the 92 earlier cases, Mr. Chen’s case, and the half of officially-detected cases with no wet market association. Then an infected person went to the market, causing a super-spreader event. Some of the infected market patrons went to the hospital, where doctors traced it back to the market and told other doctors to be on the lookout for wet market patrons coming in with weird viral pneumonias. They found some, declared victory, and the few anomalies - like the hospital worker’s relative - were forgotten, or assumed to have wet market connections that nobody could find. China quashed all evidence of the lab research (as was done in previous lab leak cases, eg the USSR) so all we have is the apparent wet market links that Peter found so convincing. Peter: The supposed pre-wet-market cases are confirmed fakes. Yes, the WHO did an investigation of whether there might have been COVID cases circulating before the wet market, and identified 92 unusual pneumonias that merited further review. But their final investigation, which included testing samples from these people after good tests became available, found that none of these people really had COVID. As for Mr. 
Chen, he said in an interview that he was hospitalized for dental issues on December 8, caught COVID in the hospital on December 16, and then was erroneously reported as “hospitalized for COVID on December 8”. The December 16 date is after the first wet market cases. Further, it seems epidemiologically impossible for COVID to have been circulating much before the first cases were officially detected December 11. The COVID pandemic doubles every 3.5 days. So if the first infection was much earlier - let’s say November 11 - we would expect 256x as much COVID as we actually saw. Even if the first couple of cases were missed because nobody was looking for them, the number of hospitalizations, deaths, etc, in January or whenever were all consistent with the number of people you’d expect if the pandemic started in early December - and not consistent with 256x that many people. So probably we should just accept that the first reported case - a wet market vendor, December 11 - was very early in the pandemic. She wasn’t literally the first case - that would most likely have been someone who worked at the raccoon-dog shop, whose case might (like 95% of COVID cases) have been mild enough not to come to medical attention. But she was certainly very early. Although authorities eventually decided COVID spread through a wet market and started deliberately looking for wet market connections, this only happened on December 30. So the earliest cases - including the 40 very earliest cases where half came from the wet market - weren’t biased (at least not through that particular route). So the claim that “the first case, and half of the first 40 cases, had wet market connections” stands as real and convincing evidence. 
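Peter's 256x figure is just repeated doubling, and it can be checked in a couple of lines. This is a rough sketch with a function name of my own choosing, and it assumes constant exponential growth at the 3.5-day doubling time for the whole period, which is exactly the assumption Saar disputes.

```python
def growth_factor(extra_days, doubling_time_days=3.5):
    """How many times larger an epidemic would be if it had started
    `extra_days` earlier, assuming constant exponential growth."""
    return 2 ** (extra_days / doubling_time_days)


# Four extra weeks is exactly eight doublings at 3.5 days each.
four_weeks = growth_factor(28)

# November 11 to December 11 is 30 days, slightly more than eight doublings.
thirty_days = growth_factor(30)

print(f"28 days earlier: {four_weeks:.0f}x")
print(f"30 days earlier: {thirty_days:.0f}x")
```

So "256x" corresponds to a start about four weeks earlier; a full 30 days would imply a somewhat larger multiple, which only strengthens the point being made.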
Although the exact center of the map of positive COVID samples in the wet market was the mahjong room, the samples taken from the mahjong room were not, themselves, positive (cf: although a low-resolution population density map of New York might show Central Park in the exact center of the population density gradient, Central Park does not itself have population). There was no real “super-spreader event” at the wet market. There was a slow burn - one case the first day, a few more the next day, a few more the day after that. It’s hard to see how a single visit from an infected lab worker could do that. So the only way it could possibly be a lab leak is if the lab leaked sometime in late November, infected exactly one lab worker, that worker went straight to the wet market, infected a vendor, then went home, quarantined, recovered, and all other cases were downstream of that first infected wet market vendor. This is unparsimonious. Saar: The only source saying that Mr. Chen got sick early was an anonymous interview. And even if he was later than the first wet market cases, nobody was able to find any wet market connections. This means that whoever infected him was earlier than the index case and not linked to the wet market. Peter argued that COVID couldn’t have been more than a few weeks old when the first wet market cases were detected. But this was based on its known doubling rate. If pre-discovery COVID had a slower doubling time than known COVID, it could have been around longer. And post-lockdown serology suggested numbers that were larger than claimed at the time. So contra Peter’s claims, the infection could have been going on longer, which wouldn’t require the first lab worker to go straight to the market. It could have been weeks. Dr. Jesse Bloom’s investigation of the wet market samples, considered the final and most conclusive, failed to find a clear connection between COVID and raccoon-dogs or any other animals. 
Although the concentration of positive samples seemed highest near the raccoon dog stall, if you do a formal statistical analysis of which animals’ DNA was found near COVID samples most often, raccoon dogs are near the bottom. The top is wide-mouth bass, which can’t get COVID. This is obviously contamination, probably from infected humans touching wide-mouth bass tanks or something. Although the Chinese data included a negative sample from a mahjong table, it included a mention of poultry being sold nearby, which might mean this wasn’t the mahjong room itself, but some other mahjong table at a poultry shop elsewhere in the market, and (dry) mahjong tables might not hold the virus well anyway. Peter: Raccoon-dogs were sold in various cages at various stalls, separated by air gaps big enough to present a challenge for COVID transmission, and there’s no reason to think that one raccoon-dog would automatically pass it to all the others. The statistical analysis just proves there were many raccoon-dogs who didn’t have COVID. But you only need one. The raccoon dog shop and the drain leading out of the raccoon dog shop had some of the highest positive sample rates, which is more interesting than a statistical analysis which everyone agrees must be wrong (since it favors bass). It’s unclear why the negative mahjong sample says something about poultry, but based on the stated location, it’s definitely the one in the mahjong room. Session 1.5: Lineages This was technically part of Session 2, but formed enough of a discrete topic that I found it confusing to intermix it with all the other viral genetics points. I’m spinning it out into a separate summary, but the videos are all in the next session. Yuri: The coronavirus eventually mutated into many different strains. But the first big split, seen in some of the earliest samples, is between two different sub-strains called Lineage A and Lineage B, which differ by two mutations. 
In these two mutations, Lineage A is the same as BANAL-52, a bat virus which is the closest-known relative of COVID, but Lineage B is different. Since COVID probably evolved from something like BANAL-52, Lineage A must have come first, spread for a while, and then gotten two new mutations, turning it into Lineage B. All of the cases at the wet market, including the first detected case, were Lineage B. Lineage A wasn’t discovered until about a week later, and none of the Lineage A patients had been to the wet market.

Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods.

This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there.

Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first. We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it’s always been twice as many cases. Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first.
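The step from "twice as many cases" to "started 3-4 days earlier" can be sketched as follows. The function is a hypothetical helper of mine, and it assumes both lineages grow exponentially at the same rate, so a constant case ratio reflects nothing but a head start.

```python
import math


def head_start_days(case_ratio, doubling_time_days):
    """Head start implied by a constant case ratio between two strains
    that grow at the same exponential rate."""
    # A ratio of 2 is one doubling, a ratio of 4 is two doublings, and so on.
    return doubling_time_days * math.log2(case_ratio)


# 2/3 of cases Lineage B versus 1/3 Lineage A is a 2-to-1 ratio.
ratio_b_to_a = 2

low = head_start_days(ratio_b_to_a, 3.0)   # doubling every 3 days
high = head_start_days(ratio_b_to_a, 4.0)  # doubling every 4 days

print(f"implied head start: {low:.1f} to {high:.1f} days")
```

A 2-to-1 ratio is exactly one doubling, so the implied head start equals the doubling time itself, which is where the 3-4 day range comes from.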
Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans. Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain. Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans. Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market! Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market. Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market. 
There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market. Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days. Stop calling the wet market a super-spreader event. The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug. The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. 
If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance. And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations. Session 2: Viral Genetics Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened. (quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.) 
We’ve found two close relatives of COVID: bat viruses called RATG-13 and BANAL-52. In particular, COVID looks more or less like BANAL-52 plus a furin cleavage site. There are 1500 sarbecoviruses, members of the family of viruses that includes SARS and SARS2/COVID. None of them except COVID have furin cleavage sites. BANAL-52, COVID’s closest ancestor, doesn’t even have anything resembling one that could mutate into a functional furin cleavage site like COVID’s. Instead, COVID - which mostly just resembles BANAL-52 with a few scattered single-point mutations - has twelve completely new nucleotides in a row - a fully formed furin cleavage site that came out of nowhere. There is nowhere else in the genome that COVID differs from BANAL-52 in such a profound way. It’s just BANAL-52 plus a little bit of random mutation plus a fully-formed furin cleavage site that came out of nowhere. Further, the furin cleavage site is weird. It uses the amino acid arginine twice. But instead of the nucleotides coding for arginine in the usual viral way, both times it uses the codon CGG - the way that higher animals code for arginine. This works fine - it’s just not how viruses do it. So the obvious conclusion is that WIV, which said in 2018 that it was going to find viruses and add furin cleavage sites to them, found a close relative of BANAL-52 and added a furin cleavage site. Since they were humans, and most familiar with the human way of encoding arginine, they added it as CGG both times. COVID seemed surprisingly optimized for infecting humans. Of fifty animals it was tested in, including the usual coronavirus intermediate hosts (pangolins, raccoon-dogs, etc), it was best at infecting human cells. Further, a virus that enters a new species will usually show a burst of mutations as it “figures out” the best way to adapt to that species’ unique biology. But COVID has had a pretty constant mutation rate in humans, from the beginning of the pandemic to the end.
That suggests it was already adapted to humans. This could be because the lab screened for viruses with existing adaptations, because they passed it through humanized mice in the lab, or because it adapted in the hundreds of undetected cases that happened between the lab and detection in the wet market. Usually, research with potentially dangerous coronaviruses is done in BSL-3 or 4, ie high to very-high security. But WIV was irresponsibly doing it in BSL-2, ie medium security. The researchers weren’t even required to wear masks. In general, about 1/500 labs will leak any given pathogen they’re working on (?!). But because WIV was researching such an infectious virus in such an irresponsible way, the odds of a leak were much higher. The most likely explanation for all these facts is that WIV went ahead and did the gain-of-function research they said they were going to do (the particular DEFUSE grant proposal we know about got rejected, but it proves that Wuhan wanted to do this, and they could easily have gotten funding somewhere else, or done it out of their regular budget). They found a close relative of BANAL-52 and added a furin cleavage site as a simple twelve-nucleotide insertion, using the human method of encoding arginine that their genetic engineers were familiar with. Then it leaked, spread for a while in the general Wuhan population, and eventually made it to the wet market where it got detected. Peter: As mentioned earlier, the DEFUSE grant was rejected. Further, the grant said that the Wuhan Institute of Virology was responsible for finding the viruses, and the University of North Carolina would do all the gain-of-function research. This was a reasonable division of labor, since UNC was actually good at gain-of-function research, and WIV mostly wasn’t. They had done a few very simple gain-of-function projects before, but weren’t really set up for this particular proposal and were happy to leave it for their American colleagues. 
Even if WIV did try to create COVID, they couldn’t have. As Yuri said, COVID looks like BANAL-52 plus a furin cleavage site. But WIV didn’t have BANAL-52. It wasn’t discovered until after the COVID pandemic started, when scientists scoured the area for potential COVID relatives. WIV had a more distant COVID relative, RATG-13. But you can’t create COVID from RATG-13; they’re too different. You would need BANAL-52, or some as-yet-undiscovered extremely close relative. WIV had neither.

Are we sure they had neither? Yes. Remember, WIV’s whole job was looking for new coronaviruses. They published lists of which ones they had found pretty regularly. They published their last list in mid-2019, just a few months before the pandemic. Although lab leak proponents claimed these lists showed weird discrepancies, this was just their inability to keep names consistent, and all the lists showed basically the same viruses (plus a few extra on the later ones, as they kept discovering more). The lists didn’t include BANAL-52 or any other suitable COVID relatives - only RATG-13, which isn’t close enough to work.

Could they have been keeping their discovery of BANAL-52 secret? No. Pre-pandemic, there was nothing interesting about it; our understanding of virology wasn’t good enough to point this out as a potential pandemic candidate. WIV did its gain-of-function research openly and proudly (before the pandemic, gain-of-function wasn’t as unpopular as it is now) so it’s not like they wanted to keep it secret because they might gain-of-function it later. Their lists very clearly showed they had no virus they could create COVID from, and they had no reason to hide it if they did.

COVID’s furin cleavage site is admittedly unusual. But it’s unusual in a way that looks natural rather than man-made. Labs don’t usually add furin cleavage sites through nucleotide insertions (they usually mutate what’s already there). On the other hand, viruses get weird insertions of 12+ nucleotides in nature.
For example, HKU1 is another emergent Chinese coronavirus that caused a small outbreak of pneumonia in 2004. It had a 15 nucleotide insertion right next to its furin cleavage site. Later strains of COVID got further 12-15 nucleotide insertions. Plenty of flus have 12 to 15 nucleotide insertions compared to other earlier flu strains.

Sometimes insertions happen because of a mistake in viral replication. Other times the virus gets confused between its own RNA and its host’s, and splices a bit of the host RNA into the virus. This would neatly explain why the insertion used the unusual coding CGG for arginine, which is common in animals but rare in viruses. On the other hand, it’s not that rare in viruses - COVID uses CGG for arginine about 3% of the time. And human engineers don’t necessarily use it any more than that - Peter was able to find one example of humans adding arginine to a virus, and 0 out of the 5 arginines added were CGG.

COVID’s furin cleavage site is a mess. When humans are inserting furin cleavage sites into viruses for gain-of-function, the standard practice is RRKR, a very nice and simple furin cleavage site which works well. COVID uses PRRAR, a bizarre furin cleavage site which no human has ever used before, and which virologists expected to work poorly. They later found that an adjacent part of COVID’s genome twisted the protein in an unusual way that allowed PRRAR to be a viable furin cleavage site, but this discovery took a lot of computer power, and was only made after COVID became important. The Wuhan virologists supposedly doing gain-of-function research on COVID shouldn’t have known this would work. Why didn’t they just use the standard RRKR site, which would have worked better? Everyone thinks it works better! Even the virus eventually decided it worked better - sometime during the course of the pandemic, it mutated away from its weird PRRAR furin cleavage site towards a more normal form.
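A figure like "CGG for arginine about 3% of the time" is a codon-usage statistic, and it may help to see how such a number could be computed. The sketch below is illustrative only: the function name `arginine_codon_usage` and the toy sequence are hypothetical, the sequence is not real viral data, and the code assumes the input starts in the correct reading frame.

```python
from collections import Counter

# The six codons that encode arginine in the standard genetic code (DNA alphabet).
ARGININE_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def arginine_codon_usage(coding_sequence: str) -> dict[str, float]:
    """Return each arginine codon's share of all arginine codons in the sequence.

    Assumes the sequence starts in frame. RNA input (U) is converted to T.
    Trailing bases that do not fill a whole codon are ignored.
    """
    seq = coding_sequence.upper().replace("U", "T")
    usable_length = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable_length, 3)]
    counts = Counter(codon for codon in codons if codon in ARGININE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in sorted(counts.items())}


# Hypothetical toy sequence (not real data): ATG AGA CGT CGG AGA TAA
toy_sequence = "ATGAGACGTCGGAGATAA"
usage = arginine_codon_usage(toy_sequence)
print(usage)
```

Run on a whole genome's coding regions, the `CGG` entry of the result would be the kind of percentage quoted in the debate; the toy input here only demonstrates the counting.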
Further, COVID’s furin cleavage site was inserted via what seems to be a frameshift mutation - it wasn’t a clean insertion of the amino acids that formed the site, it was an insertion of a sequence which changed the context of the surrounding nucleotides into the amino acids that formed the site. This is a pointless too-clever-by-half “flourish” that there would be no reason for a human engineer to do. But it’s exactly the kind of weird thing that happens in the random chance of evolution.

COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them.

The claim that COVID is uniquely adapted to humans is false. The paper that made this claim measured how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals.

It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species.

In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience.
They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness. Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market.

Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this.

The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses.

The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural.

The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up.

Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015.
All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up.

But suppose they did make more trips. Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low.

Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky.

Session 3: Closing Arguments

This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here.

Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable.

The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus.
Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market.

In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand.

Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too.

Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market.

Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way.

And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days.
It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate.

Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate.

Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed.

Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor.

And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one.

Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper.
It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus. His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”.

The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started.

Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine[3], but you can see Saar’s version here, and Peter’s starting at 45:33 of this video.

Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them. Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate.
Rootclaim’s real final probability is 94% lab leak. You can see their analysis here.

And The Winner Is . . .

… … … … …

Peter and the zoonosis hypothesis.

This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter. You can see the judges’ own summary of their reasoning here (Will, Eric).

Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing[4].

Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting.

And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey).

For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis.

III. The Math And The Aftermath

The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length.

The Math: Judges

Both judges included a probabilistic analysis in their written decision.
Here’s the same table as above, expanded to add the judges:

I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here.

We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought. For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then included the analysis, saying:

“I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself.”

Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said:

“I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes.”
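For readers who have not seen this kind of analysis, the mechanics the judges are describing can be sketched in a few lines: start from prior odds, multiply by one Bayes factor per piece of evidence, then convert the final odds to a probability. The function names below are hypothetical and the three example factors are invented purely for illustration; they are not the debaters' or judges' numbers. Only the 533-to-1 and 94% figures come from the text above.

```python
from math import prod


def posterior_odds(prior_odds: float, bayes_factors: list[float]) -> float:
    """Multiply prior odds by the Bayes factor of each piece of evidence.

    This treats the pieces of evidence as independent, which is itself
    an assumption that a real analysis would need to defend.
    """
    return prior_odds * prod(bayes_factors)


def odds_to_probability(odds: float) -> float:
    """Convert odds of X-to-1 into a probability between 0 and 1."""
    return odds / (1.0 + odds)


def probability_to_odds(probability: float) -> float:
    """Convert a probability into odds of X-to-1."""
    return probability / (1.0 - probability)


# Invented factors for illustration only: even prior odds, three pieces of evidence.
example_odds = posterior_odds(1.0, [0.5, 4.0, 0.1])
example_probability = odds_to_probability(example_odds)

# Figures quoted in the text: 533-to-1 odds, and a 94% final probability.
central_estimate = odds_to_probability(533)   # a little under 99.9%
final_as_odds = probability_to_odds(0.94)     # a little under 16-to-1

print(example_odds, example_probability, central_estimate, final_as_odds)
```

The conversion shows how far apart 533-to-1 and 94% are: the first is above 99.8%, while the second corresponds to odds of roughly 16-to-1.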
I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning.

That having been said, let’s look deeper into some of Saar’s concerns.

The Math: Extreme Odds

Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market.

Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance.

Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example:

“A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc.”
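Saar's DNA example can be made concrete with a small calculation. The sketch below assumes a simple two-branch model: either the statistical model holds (and a match happens with the tiny modeled probability), or something outside the model went wrong (a lab mistake, framing, and so on) and a match is then likely. The function name and the 1-in-1,000 failure rate are hypothetical, chosen only to show the shape of the effect.

```python
def effective_likelihood(
    p_within_model: float,
    p_model_failure: float,
    p_given_failure: float = 1.0,
) -> float:
    """Probability of the evidence once out-of-model explanations are included.

    p_within_model:  probability of the evidence if the model's assumptions hold
    p_model_failure: probability that the assumptions do not hold (assumed value)
    p_given_failure: probability of the evidence when they do not hold
    """
    return (1.0 - p_model_failure) * p_within_model + p_model_failure * p_given_failure


# 1E-9 is the prosecutor's figure from the quote; 1E-3 is a made-up failure rate.
p_match_if_innocent = effective_likelihood(p_within_model=1e-9, p_model_failure=1e-3)

# If a guilty suspect always matches, the Bayes factor is 1 / p(match | innocent).
bayes_factor_for_guilt = 1.0 / p_match_if_innocent

print(p_match_if_innocent, bayes_factor_for_guilt)
```

Under these assumed numbers the Bayes factor is close to 1,000 rather than 1E9: the out-of-model term dominates, so the strength of the evidence is capped by how much one trusts the model rather than by the modeled coincidence rate.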
So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument.

Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds.

This makes total sense, it’s absolutely true, and I want to be really, really careful with it. If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example:

The sun might go nova.
Lineage A (left) was used by the Minoan Cretans, but has never been deciphered. Lineage B (right) was used by the Mycenaeans for lists of palace goods.

This matches Saar’s story above. The lab leaked to somewhere else in Wuhan, not the wet market. The virus spread undetected in the population for a while. During this time, it mutated to Lineage B. Then one of the people with Lineage B went to the wet market and started a superspreader event. The authorities sampled the patients, found Lineage B, then started looking elsewhere. Later they detected some of the earlier Lineage A cases. The market is unlikely to be the origin of the pandemic, because the original Lineage A strain wasn’t found there.

Peter: Although Lineage A is evolutionarily older, Lineage B started spreading in humans first.

We know this because Lineage B is more common. Throughout the early pandemic, until the D614G variant drove all other strains extinct, a consistent 2/3 of the cases were B, compared to 1/3 A. Both strains spread at the same rate, so the best explanation is that B started earlier than A. Since COVID doubles every 3-4 days, probably Lineage B started 3-4 days earlier than Lineage A, which explains why it’s always been twice as many cases.

Lineage B also has more internal genetic diversity than Lineage A. In general, older viruses have more genetic diversity (the “molecular clock”). This is further evidence that B started spreading first. Pekar 2022 and Pipes 2021 do analyses with known parameters for spread rate and diversity, and find 90%+ odds that Lineage B was the first one in humans.

Why did the older strain start spreading later? Probably the virus crossed from bats into raccoon-dogs on some raccoon-dog farm out in the country. It spread in the raccoon-dogs for a while, racking up mutations, including the (less mutated) Lineage A strain and the (slightly more mutated) Lineage B strain.
Then several raccoon-dogs were taken to Wuhan for sale, including one with Lineage A and another with Lineage B. The one with Lineage B passed its virus to humans earlier. Then 3-4 days later, the Lineage A one passed its virus to humans.

Lineage A was first found in a Wuhan neighborhood right next to the wet market (closer to the wet market than 97% of Wuhan’s population). Again, it would be a bizarre coincidence if a lab leak pandemic was first detected at a wet market. But it would be an even more bizarre coincidence if a lab leak pandemic separated into two strains, and both were first detected at a wet market!

Although no known wet market cases were Lineage A, a positive Lineage A environmental sample was found at the wet market, and everyone agrees most cases went undetected. So maybe the Lineage B raccoon-dog spread its virus to a vendor, and that sub-strain mostly stayed in the market. But the Lineage A raccoon-dog spread its virus to a customer, who went back to his house nearby, and that strain spread in the neighborhoods next to the market. This is the only story that explains the evolutionary precedence of A, the greater spread and older molecular clock of B, and the fact that both strains were first found very close to the wet market.

Yuri/Saar: Lineage B could be more common and diverse because it got the advantage of a super-spreader event in the wet market.

There are a few scattered cases of intermediates between A and B, and a few other scattered cases of lineages that seem even more ancestral (ie closer to the bat virus) than either. This doesn’t make sense in a double spillover hypothesis. But it does make sense if the lineages separated in human transmission somewhere between the lab and the first super-spreader event at the wet market.

Peter: Again, the wet market wasn’t a super-spreader event. COVID spread in the wet market at exactly its normal spread rate, doubling about once every 3.5 days.
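Peter's head-start argument rests on simple exponential-growth arithmetic: if two lineages double at the same rate, the one that started earlier stays ahead by a constant ratio. A minimal sketch of that arithmetic follows, assuming clean exponential growth from a single introduction each; the function names are hypothetical.

```python
from math import log2


def head_start_ratio(head_start_days: float, doubling_time_days: float) -> float:
    """Case ratio between two lineages with the same doubling time,
    where the first lineage started head_start_days before the second."""
    return 2 ** (head_start_days / doubling_time_days)


def head_start_from_ratio(case_ratio: float, doubling_time_days: float) -> float:
    """Head start in days implied by an observed, constant case ratio."""
    return doubling_time_days * log2(case_ratio)


# A 3.5-day head start with a 3.5-day doubling time gives twice as many cases.
ratio = head_start_ratio(head_start_days=3.5, doubling_time_days=3.5)

# Conversely, a steady 2:1 split (2/3 of cases B, 1/3 A) implies one doubling time.
implied_head_start = head_start_from_ratio(case_ratio=2.0, doubling_time_days=3.5)

print(ratio, implied_head_start)
```

The ratio depends only on the head start divided by the doubling time, which is why a stable 2/3 versus 1/3 split is what one would expect from a head start of about one doubling period.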
Stop calling the wet market a super-spreader event.

The scattered cases of “intermediates” are sequencing errors. They were all found by the same computer software, which “autofills” unsequenced bases in a genome to the most plausible guess. Because Lineage B was already in the software, depending on which part of a Lineage A virus you sequenced, you might get one half or the other autofilled as Lineage B, which looked like an “intermediate”. We know this because all the supposed “intermediates” were partial cases sequenced by this particular software. We can confirm this by noting that there are too many intermediates! That is, where Lineage A is (T/C) and Lineage B is (C/T), the software found both (T/T) “intermediates” and (C/C) “intermediates”. But obviously there can only be one real intermediate form, and we have to dismiss one or the other. But in fact we can dismiss both, because they were both caused by the same software bug.

The scattered “progenitor” cases - those closer to the ancestral bat virus than either A or B - are reversions, ie cases where a new mutation in the virus happened to hit an already-mutated base and shift it back towards the ancestral virus. We know this because all of these “progenitors” were scattered cases found months after the pandemic started, often in entirely different countries from Wuhan. If these were real progenitor viruses, they would have either fizzled out or exploded into a substantial portion of all cases, not be found one time in one guy in Malaysia. Given the number of mutations the virus developed over the course of the pandemic, it’s inevitable that some of them would be mutations that bring it closer to the original bat virus, and in fact we find the number of “progenitors” found very nicely matches the number of progenitor-appearing viruses we would expect by chance.
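The autofill explanation for the “intermediates” can be illustrated with a toy model. It assumes, as a simplification, that the software fills any unsequenced position with the Lineage B base; the two-position genomes are the (T/C) and (C/T) pairs from the text, and the `autofill` function is hypothetical rather than the behavior of any specific tool.

```python
from typing import Optional

# The two positions that distinguish the lineages, as given in the text.
LINEAGE_A = ("T", "C")
LINEAGE_B = ("C", "T")

PartialRead = tuple[Optional[str], Optional[str]]


def autofill(partial: PartialRead, reference: tuple[str, str]) -> tuple[str, str]:
    """Fill unsequenced positions (None) with the reference base.

    Toy assumption: the reference is Lineage B, because it was
    already in the software when these samples were processed.
    """
    return tuple(ref if base is None else base for base, ref in zip(partial, reference))


# Two partial reads of a genuine Lineage A virus, each missing one position.
partial_reads_of_a: list[PartialRead] = [
    (LINEAGE_A[0], None),  # only the first position was sequenced
    (None, LINEAGE_A[1]),  # only the second position was sequenced
]

artifacts = {autofill(read, LINEAGE_B) for read in partial_reads_of_a}
print(artifacts)
```

In this toy model the same Lineage A virus yields both a (T/T) and a (C/C) apparent intermediate depending on which half was sequenced, which is the "too many intermediates" pattern Peter points to.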
And in many cases, we know the “progenitors” are newer than the original lineages, because they also have some of the later mutations that Lineage A or B picked up along the way, alongside their apparent ancestral-bat-virus-like mutations.

Session 2: Viral Genetics

Yuri: Two years before COVID, scientists at the Wuhan Institute of Virology, together with colleagues at the University of North Carolina, sent in a grant proposal for the DEFUSE program. This program, intended to locate and better understand potential future pandemic viruses, involved going into bat caves and collecting new coronaviruses. Once they had them, they would do gain-of-function: specifically, they would add a furin cleavage site to make them more infectious and see what happened.

(quick interlude: COVID’s spike protein has two sections: one binds to human cells through the ACE2 receptor, the other helps fuse with the cell after binding. In order to avoid the immune system, it hides both of these into one spike. But when it reaches a cell, it needs to separate them again. It takes advantage of a human respiratory enzyme, furin, to do the separation - this also ensures that it only infects its primary target, human respiratory cells. The part of COVID that lets it get separated by furin is called the “furin cleavage site”. COVID’s bat-virus ancestors were gastrointestinal viruses; the addition of a furin cleavage site was what made them respiratory viruses.)
But it’s exactly the kind of weird thing that happens in the random chance of evolution.

COVID is hard to culture. If you culture it in most standard media or animals, it will quickly develop characteristic mutations. But the original Wuhan strains didn’t have these mutations. The only ways to culture it without mutations are in human airway cells, or (apparently) in live raccoon-dogs. Getting human airway cells requires a donor (ie someone who donates their body to science), and Wuhan had never done this before (it was one of the technologies only used at the superior North Carolina site). As for raccoon-dogs, it sure does seem suspicious that the virus is already suited to them.

The claim that COVID is uniquely adapted to humans is false. The paper that claimed that defined how well COVID was adapted to different animals by those animals’ difference (on the relevant cell receptors) from humans. So in its methodology, humans came out #1 by default. If you don’t do that, COVID is better-adapted to many other animals.

It’s not necessarily true that viruses see a burst of mutations when they enter a new host. COVID spread to deer and mink, and in neither case was there a burst of mutations. COVID has a pretty simple job of infecting respiratory cells and is already very good at it, regardless of species.

In Yuri’s model, Wuhan Institute of Virology picked up a discarded grant and decided to do the gain-of-function half allotted to a different university, despite their relative inexperience. They skipped over all the SARS-like viruses they were supposed to work on, and all the standard gain-of-function model backbones, in favor of BANAL-52, a virus which would not be discovered for another two years, but which they somehow had samples of, which they had for some reason decided to keep secret despite its total lack of interestingness.
Then they would have had to eschew all usual gain-of-function practices in favor of inserting a weird furin cleavage site that shouldn’t have worked according to the theory they had at the time, via a frameshift mutation. Then they would have had to culture it, a technique beyond their limited capabilities. Then it would have had to leak, and magically show up again in front of the raccoon-dog stall at a wet market.

Yuri: WIV wouldn’t have needed to keep BANAL-52 “secret” in some kind of sinister way. Plenty of researchers have backlogs of work they haven’t published yet. Probably they found a BANAL relative in one of their normal sampling trips, did some preliminary studies on it, and planned to publish it later once they cleaned up their data. Everyone works like this.

The part of DEFUSE saying that they would only work on viruses that were 95% similar to SARS is unclear and might mean something else. It looks more like they say they’ll start with those viruses, but also do some work on novel viruses. BANAL-52 could have been one of the novel viruses.

The furin cleavage site is weird, but the researchers might have done that on purpose, to make the virus easier to keep track of, or to test different furin cleavage sites. Depending on the exact BANAL-52 relative they used, it might not even be a frameshift; there’s a particular way to spell serine that would make the insertion more natural.

The claims that COVID can’t be cultured in normal media are based on speculative original research by Peter and might not hold up.

Peter: WIV did most of its virus-gathering in a trip to a Yunnan cave between 2010 and 2015. All those viruses have long since been processed and added to the database. There’s no sign that they made more trips to Yunnan caves, and no reason for them to keep that secret. So the idea that they might just have some new viruses they didn’t publish doesn’t hold up. But suppose they did make more trips.
Given the amount of time between the DEFUSE proposal and COVID, if they kept to their normal virus-collection rate, they would have gotten about thirty new viruses. What’s the chance that one of those was BANAL-52? There are thousands of bat viruses, and BANAL-52 is so rare that it wasn’t found until well after the pandemic started and people were looking for it very hard. So the chance that one of their 30 would be BANAL-52 is low. Also, they said in DEFUSE that they planned to go back to the same Yunnan cave. But BANAL-52 was found far away from that cave, so unless it ranged over a wide area, they probably couldn’t have found it even if they got very lucky. Session 3: Closing Arguments This third debate was supposed to be about “inference”, ie how much Bayesian evidence was provided by each of the facts given so far, and how to fit them into the Rootclaim probabilistic model. I’m going to relegate my summary of the more probabilistic half to the next section of this post, and just include the closing arguments here. Saar: Peter’s case hinges on the idea that it’s very improbable that a lab leak pandemic would first show up at a wet market. But this isn’t necessarily improbable. The Huanan Seafood Market had several factors that made it a likely location for a superspreader event. It was busy, with over 10,000 visitors a day. Many of the people there (eg the 1,000 vendors) came back daily, letting them reinfect each other. It had poor ventilation, especially in the high-positivity area near the raccoon-dog stall. It had cold wet surfaces on which the virus could survive for long periods. It was indoors, which prevented UV light from killing the virus. Given a small amount of sporadic COVID going around Wuhan, it’s not surprising for the first place it started spreading en masse to be a wet market. In fact, we have several examples of this. When China was COVID Zero, there would occasionally be small outbreaks that the authorities would have to contain. 
Most of these were at wet markets. For example, the big COVID outbreak in Beijing started at Xinfadi Market, their local seafood market. This couldn’t be an animal spillover, because there were no raccoon-dogs or other weird wildlife there. So it must be that wet markets are natural places for superspreader events. There are several other examples, which make up about half of the total outbreaks in Zero COVID era China, plus others in Singapore and Thailand. Since COVID clusters concentrate in wet markets even when there is no animal spillover, we should accept this as a property of the virus, and not attribute any significance to the fact that this happened in Wuhan too. Peter: About 1/10,000 citizens of Wuhan was a wet market vendor. So there’s a 1/10,000 chance that the first known COVID case should be a wet market vendor by chance alone. Weibo lists the most popular places for people to check in to their network on their phones, and the wet market was the 1600th most popular place in Wuhan, meaning that if you weight locations by busy-ness, there’s a less than 1/1600 chance that the first cases would be in the wet market. Yes, the wet market is indoors, has mediocre ventilation, has repeat visitors, etc. So do thousands of other places in Wuhan, like schools, hospitals, workplaces, places of worship. The wet market isn’t special in any way. And again, it wasn’t a superspreader event! COVID spread at the same rate in the wet market as it does everywhere else: doubling once per 3.5 days. It doesn’t matter what kinds of arguments you can come up with for why the wet market should have been the perfect superspreader event location, we can look at it and see that it wasn’t. It’s an environment that spreads COVID at exactly the normal rate. Zero COVID era Chinese outbreaks were concentrated in wet markets because they received infected animal products. 
We know why there was an outbreak in the Xinfadi Market in Beijing: it was because the seafood stall got frozen fish from some non-Zero-COVID country, the fish had COVID particles on it, and the vendor got infected and spread it to everyone else. Something like this is true for the other Chinese wet market based outbreaks we know about. So this makes the opposite point you think it does: wet markets start outbreaks because there are infected goods being sold there. Then the virus spreads through the wet market at a completely normal rate.

Saar: The Weibo list of 1600 places bigger than the wet market is likely inaccurate, because it's based on check-in data and people don't check in to seafood markets. Most of those 1600 places aren't amenable to superspread. The 70 markets supposedly bigger than Huanan are irrelevant, because they're supermarkets, open air markets, etc. Huanan is the largest seafood market in central China, and a more likely place for the first cluster of cases to be noticed. Markets weren't a common spillover location in SARS1, so the zoonosis hypothesis hasn't "called" this event in a way that should give them a high Bayes factor.

And there’s still plenty of evidence for isolated (though not super-spreading) pre-market cases. A British expatriate in Wuhan, Connor Reed, says he got sick in November, three weeks before the first wet market case. Later the hospital tested his samples and said it was COVID. Another paper reports 90 cases before the first wet market one.

Peter: Connor Reed was lying. The case wasn’t reported in any peer-reviewed paper. It was reported in the tabloid The Daily Mail, months after it supposedly happened. He also told the Mail that his cat died of coronavirus too, which is rare-to-impossible. Also, to get a positive hospital test, he would have had to go to the hospital, but he was 25 years old and almost no 25-year-olds go to the hospital for coronavirus.
His only evidence that it was COVID was that two months later, the hospital supposedly “notified” him that it was. The hospital never informed anyone else of this extremely surprising fact which would be the biggest scientific story of the year if true. So probably he was lying. Incidentally, he died of a drug overdose shortly after giving the Mail that story; while not all drug addicts are liars, given all the other implausibilities in his story, this certainly doesn’t make him seem more credible. And in any case, he claimed he got his case at a market “like in the media”.

The other 90 cases are also fake. A lab leak guy found a paper that mentioned 90 more cases than other papers, and made up a conspiracy theory where the author was trying to secretly communicate that there had been 90 secret cases before any of the confirmed cases, even though there was nothing about this in the text of the paper. But actually that paper just counted cases differently than other papers, and they were referring to normal cases after the pandemic officially started.

Again, I’ll come back to the discussion about inference later, but for now, here’s a table of both sides’ reasoning. This exact presentation comparing both analyses is mine[3], but you can see Saar’s version here, and Peter’s starting at 45:33 of this video.

Slightly made up; the two sides didn’t express their probabilities in the same way and I had to make editorial decisions to match them.

Note that these aren't entirely comparable because Peter is being laxer about out-of-model probability than Saar. Although Saar's final odds here are 533-to-1, this is just the central estimate. Rootclaim’s real final probability is 94% lab leak. You can see their analysis here.

And The Winner Is . . .

… … … … …

Peter and the zoonosis hypothesis. This was a decisive victory. There were two judges, who each gave separate verdicts (or were allowed to declare a draw). Both judges decided in favor of Peter.
You can see the judges’ own summary of their reasoning here (Will, Eric).

Manifold agreed with the judges. There was a prediction market on who would win. It started out 70-30 in favor of lab leak. As the videos came out, zoonosis started doing better and better. I don’t want to take the exact final numbers too seriously, since I think some of the later price increases involved hints from the participants’ behavior. But it’s clear which way viewers thought the wind was blowing[4].

Around the same time, the Good Judgment Project - Philip Tetlock’s group studying superforecasters - put out a report on the lab leak hypothesis. After studying it in depth, his forecasters ended up 75-25 in favor of zoonosis. The Rootclaim debate was one of ten sources they said they found especially interesting.

And also around the same time, and unrelated to any of this, the Global Catastrophic Risks Institute surveyed experts (“168 virologists, infectious disease epidemiologists, and other scientists from 47 countries”) and found the same thing (though see here for some potential problems with the survey):

For what it’s worth, I was close to 50-50 before the debate, and now I’m 90-10 in favor of zoonosis.

III. The Math And The Aftermath

The third debate session was about “inference”, how to put evidence together. I put this part off until after disclosing the winner, because I wanted to talk about some of these issues at more length.

The Math: Judges

Both judges included a probabilistic analysis in their written decision. Here’s the same table as above, expanded to add the judges:

I shoehorned the judges’ factors into the categories I already had; some of them were actually subtly different from Peter’s, Saar’s, and each other’s. The “priors” category is especially a mess here. We’ll go over these later, but I get the impression that they both thought of probabilistic analyses as an afterthought.
For example, Judge Eric wrote 30,000 words about which considerations moved him, and only then includes the analysis, saying: I am not convinced that this Bayesian calculation is even an appropriate way to estimate the relative posterior probability of Z and LL; it just seemed fair that after criticizing Rootclaim’s calculations at length I should make an attempt at it myself. Judge Will’s decision ran to 10,000 words. He said he independently tried both reasoning it out intuitively, and running the Bayesian analysis, and was relieved when these two methods returned the same result. He said: I am skeptical that the Bayesian decision making/evaluation methods are any more "objective" than [intuitive reasoning]. I think they maximize legibility, not objectivity, and tend to hide the intuitive/heuristic portion in the data inclusion step and values, where it’s harder to see . . . I am not skilled in the Bayesian method, and I am sure I made significant mistakes. More time and practice would improve and refine my estimates. At the fundamental rules of the universe level, Bayesian analysis must be the best way to evaluate evidence. However, I am unsure that it’s a good strategy for a human given our cognitive limitations, and doubly unsure it’s truly being used (in the dispassionate sense) where the outcome is social desirability/fame/Twitter likes. I’m focusing on this because Saar’s opinion is that the debate went wrong (for his side) because he didn’t realize the judges were going to use Bayesian math, they did the math wrong (because Saar hadn’t done enough work explaining how to do it right), and so they got the wrong answer. I want to discuss the math errors he thinks the judges made, but this discussion would be incomplete without mentioning that the judges themselves say the numbers were only a supplement for their intuitive reasoning. That having been said, let’s look deeper into some of Saar’s concerns. 
The Math: Extreme Odds Saar complained that Peter’s odds were too extreme. For example, Peter said there was only a 1/10,000 chance that a lab leak pandemic would first show up at a wet market. Peter’s argument went something like: obviously a zoonotic pandemic would start at a site selling weird animals. But a lab leak pandemic - if it didn’t start at the lab - could show up anywhere. 1/10,000 Wuhan citizens work at the wet market. So if a lab leak was going to show up somewhere random, the wet market was a 1/10,000 chance. Saar had specific arguments against this, but he also had a more general argument: you should rarely see odds like 1/10,000 outside of well-understood domains. In his blog post, he gave this example: A prosecutor shows the court a statistical analysis of which DNA markers matched the defendant and their prevalence, arriving at a 1E-9 probability they would all match a random person, implying a Bayes factor near 1E9 for guilty. But if we try to estimate p(DNA|~guilty) by truly assuming innocence, it is immediately evident how ridiculous it is to claim only 1 out of a billion innocent suspects will have a DNA match to the crime scene. There are obviously far better explanations like a lab mistake, framing, an object of the suspect being brought by someone to the scene, etc. So the real p(wet market|lab leak) isn’t the 1/10,000 chance a pandemic arising in a random place hits the wet market, but the (higher?) probability that there’s something wrong with Peter’s argument. Then Saar tried to show specific things that might be wrong with Peter’s argument. I didn’t find his specific examples convincing. But maybe the question shouldn’t be whether I agreed with him. It should be whether I’m so confident he’s wrong that I would give it 10,000-to-1 odds. This makes total sense, it’s absolutely true, and I want to be really, really careful with it. 
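The shape of Saar's general argument can be sketched as a mixture: the chance of seeing the evidence is the headline figure if the argument behind it is sound, plus whatever the chance would be if the argument is flawed, weighted by how likely a flaw is. This is only an illustration. The function name and every input other than the 1E-9 and 1/10,000 figures quoted above are hypothetical placeholders, not numbers from either debater.

```python
def effective_probability(p_if_model_right, p_model_wrong, p_if_model_wrong):
    """Chance of seeing the evidence once the argument itself might be flawed."""
    return (1 - p_model_wrong) * p_if_model_right + p_model_wrong * p_if_model_wrong

# DNA example from Saar's post: 1e-9 is the stated random-match probability.
# The 1-in-1,000 rate of lab mistakes, framing, etc. is a made-up illustration.
dna = effective_probability(1e-9, p_model_wrong=1e-3, p_if_model_wrong=1.0)

# Wet market: 1/10,000 is Peter's figure. The 1% chance that his argument is
# flawed, and the 50% chance of a market-first outbreak in that case, are
# hypothetical placeholders, not estimates from the debate.
market = effective_probability(1 / 10_000, p_model_wrong=0.01, p_if_model_wrong=0.5)

print(f"DNA match given innocence: {dna:.2e}")
print(f"Wet market given lab leak: {market:.2e}")
```

With these placeholder inputs the model-error term dominates both results, so the headline figure matters less than how likely the argument behind it is to be wrong.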
If you take this kind of reasoning too far, you can convince yourself that the sun won’t rise tomorrow morning. All you have to do is propose 100 different reasons the sunrise might not happen. For example: The sun might go nova.
This alone isn’t fatal to lab leak. It’s perfectly possible for the lab to leak (let’s say) November 5th, the virus spreads a bit, and then a month later someone goes to the wet market, coughs on a vendor, and starts the officially recognized pandemic. But if that were true, you’d expect (let’s say) 30 cases by early December. Let’s say the wet market vendor was exactly Case # 30. She infected the other wet market vendors, starting a pandemic with an obvious center at the wet market and lots of infected wet market vendors and patrons. What about Case # 29? If they were (let’s say) a barista, how come they didn’t infect people at their coffee shop? How come there wasn’t a second obvious cluster radiating out from a coffee shop, lots of coffee-shop-linked cases, etc? How come there weren’t 30 equally-sized clusters? In order to avoid this, you either need to claim that the wet market was a perfect superspreader location, or that the pattern with lots of cases in the wet market and few-to-none anywhere else was a result of ascertainment bias. Saar made both those arguments during the debate, but I thought Peter rebutted them effectively. 1.4: COVID in Brazilian wastewater Nicholas Halden (blog) writes: What should we make of this study, which found the presence of covid in Brazilian wastewater in late 2019? Consider the doubling times. The study says that scientists working in late 2020 found COVID in samples of Brazilian wastewater from November 27, 2019. This was long before the first detected case of transmission in Brazil on March 13, 2020. Between November 27, 2019 and March 13, 2020 is about 16 weeks, so 32 COVID doubling times. 32 doubling times with no lockdown is enough time for COVID to infect every single person in Brazil. If COVID had infected everyone in Brazil before the first recognized case, we would have noticed. 
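The doubling-time arithmetic can be checked with a short sketch. It assumes a single seed case, unchecked growth at the 3.5-day doubling time used in the text, and a population of roughly 210 million for Brazil (an outside round figure, not from the source). Using the exact dates gives a slightly lower doubling count than the rounded "16 weeks, 32 doublings", but still more than enough to reach the whole country.

```python
from datetime import date
import math

DOUBLING_DAYS = 3.5               # doubling time used in the text (approximate)
BRAZIL_POPULATION = 210_000_000   # assumption: rough round figure, not from the source

first_sample = date(2019, 11, 27)  # wastewater sample date cited in the study
first_case = date(2020, 3, 13)     # first detected transmission in Brazil, per the text

days = (first_case - first_sample).days
doublings = days / DOUBLING_DAYS

# Cases after unchecked doubling from one infection (ignores saturation).
unchecked_cases = 2 ** doublings

# Doublings needed for one case to reach the assumed population.
doublings_to_saturate = math.log2(BRAZIL_POPULATION)

print(f"days between dates: {days}")
print(f"doublings available: {doublings:.1f}")
print(f"doublings needed to reach everyone: {doublings_to_saturate:.1f}")
print(f"exceeds population: {unchecked_cases > BRAZIL_POPULATION}")
```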
(again, COVID doubling time isn’t exactly invariably 3.5 days, but here we’re talking about numbers big enough that the exact details don’t matter very much)

So if COVID was in Brazil on November 27, it must have fizzled out instead of going pandemic. How likely is that? If one person had COVID, it’s not too unlikely - not all COVID cases transmit it forward. If (let’s say) twenty people had COVID, it’s very unlikely - at that point, the law of large numbers takes over; in a freak coincidence, every single patient would have to fail to infect anyone else. So almost certainly fewer than 20 people in Brazil had COVID on November 27.

So which is more likely - that somehow 20 people had COVID long before the virus was officially detected, and on a totally different continent, yet somehow a scientist looking through wastewater found the water from exactly those people and managed to detect the virus? Or that there was a sampling error, which happens all the time in these kinds of things?

Peter wrote a blog post on some of these issues. He found that there were positive tests from wastewater samples as early as March 2019, which doesn’t fit anyone’s timeline, including lab leakers’. And most of these positives (including the Brazilian sample) contained later strains of the virus with mutations it picked up late in 2020. So these were almost certainly false positives from contamination.

1.5: Biorealism’s 16 arguments

Biorealism has a list of sixteen arguments, which he liked so much that he posted it three times in the ACX comments, twice on Less Wrong, twice on Manifold, and about a dozen times on Twitter under multiple account names. Some posts were slightly different from others, but a typical version is:

Importantly, Miller incorrectly claimed the N501Y mutation would result from passage in hACE2 mice (mixed them up with BALB/c mice). The major papers Miller relied on have been seriously challenged since the debate.
See Stoyan and Chiu (2024), Weissman (2024), Bloom (2023) and Lv et al (2024). Overall the circumstantial evidence makes lab v plausible: Peter admitted getting this wrong during the debate. I think this very minor point about mice mutations was approximately his only mistake in 15 hours of debating, and he admitted it as soon as he noticed. Biorealism somehow heard about this (obviously not through watching the debate, as we’ll see in a moment), then left about 20-30 comments starting with it, under various accounts, on various platforms, as if it somehow discredited Peter. This is making me somewhat less charitable to him and his 16 arguments than I would be otherwise. 1. Chinese researchers Botao & Lei Xiao observed lab origin was likely given the nearest known relatives to SARS-CoV-2 were far from Wuhan. Wuhan Institute of Virology (WIV) sampled SARS-related bat coronaviruses where the nearest relatives are found in Yunnan, Laos and Vietnam ~1500km away. They refuse to share their records. The ancestral viruses of SARS were found equally far from where SARS spilled over into humans, so we know it’s possible (and likely) for viruses to travel that far. 2. Patrick Berche, DG at Institut Pasteur in Lille 2014-18, notes you would expect secondary outbreaks if it arose via the live animal trade. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10234839/ There are constant outbreaks of weird coronaviruses in animal handlers. See eg this paper, which estimates about 60,000 of these per year. None of these ever go anywhere, because the farmers are in rural areas that aren’t dense enough to sustain a high R0, and the epidemic fizzles out after a single digit number of cases. Any early outbreaks of COVID would have vanished into this long and mostly unnoticed list. 3. Molecular data: Only sarbecovirus with a furin cleavage site. Well adapted to human ACE2 cells. Low genetic diversity indicating a lack of prior circulation (Berche 2023). 
Restriction site SARS-CoV-2 BsaI/BsmBI restriction map falls neatly within the ideal range for a reverse genetics system and used previously at WIV and UNC. Ngram analysis of the codon usage per Professor Louis Nemzer https://twitter.com/BiophysicsFL/status/1667232580255490053?t=IJgitS5cw364ioclzVWxaA&s=19 The SARS2 backbone is very low in CG and CpG. While the 12-nt insert that gives it the FCS is extremely high in both. Almost as if it was some kind of chimera of a consensus sequence and a codon-optimized polybasic cleavage site? https://twitter.com/BiophysicsFL/status/1752800486837678377?t=EpIRgyybJVaPgeMP5xdstA&s=19 https://www.biorxiv.org/content/10.1101/2022.10.18.512756v1 https://link.springer.com/article/10.1007/s10311-021-01211-0?fbclid=IwAR1HMUMtLIAFOFppVasQDeoIAYrVhP8j4YoPO4wnaTOUiKLsllZl_oKryOw

Most of this was discussed extensively in the second session of the debate, which I recommend.

The CGG-CGG arginine codon usage is particularly unusual but used in synthetic biology.

I asked a synthetic biologist about this. He said:

» “Nope. I would literally never do this if I was designing a small insert (maybe I wouldn't notice if it happened by chance with ~1 in 25 odds in a naive codon optimization algorithm as part of a larger sequence). High GC% is bad. Tandem repeat is worse. Several other perfectly fine arginine codons. And I wouldn't engineer a viral genome using human codon usage. An engineer would not do it.”

4. DEFUSE full proposal: virus 20% different from SARS1, consensus seq assembled with 6 segments, without disrupting coding seq, BsmBI order, FCS. SARS2: 20% different than SARS1, 6 evenly spaced fragments w BsmBI and BsaI restriction sites, FCS. Jesse Bloom, Jack Nunberg, Robert Townley, Alexandre Hassanin have observed this workflow could have lead to SARS-CoV-2. Work often begins before funding sought or goes ahead anyway.

Re: 4 - Also scattered across second section of debate, also not going to retread.

5. Market cases were all lineage B.
Lv et al (2024) indicates there was a single point of emergence and A came before B. So market cases not the primary cases. See also Bloom (2021), Kumar et al (2022). Peter Ben Embarek said there were likely already thousands of cases in Wuhan in December 2019. https://t.co/50kFV9zSb6 https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/34398234/ https://academic.oup.com/bioinformatics/article/38/10/2719/6553661

There was a Lineage A sample in the market, lab leak proponents just try to ignore/dismiss/conspiracize it away. The first two known Lineage A cases were very close to the market. Lv (is this even a real name? It sounds like Roman numeral? But I guess that’s what you expect in a country ruled by someone named Xi) found some weird COVID variants in Shanghai that might or might not mean anything; you can see some discussion of the implications here, but I don’t think they’re strong evidence either way. If A was first, it means some really weird coincidences have to happen to give us the spread rates and genetic clock data we get, but they’re not necessarily weirder in the zoonosis hypothesis than the lab leak one. The claim that there were “thousands of cases in Wuhan in December 2019” is very easy to disprove by doubling rate arguments like the one above, by the blood bank study mentioned above, by the WHO’s failed case search, and by many other lines of argument.

6. Evidence for lineage A in the market is based on a low quality sample according to Liu et. al. (2023).

I really think lab leakers need to decide whether they think China is a sinister actor trying to cover up the truth, or whether they should trust every offhand comment by Chinese government officials as gospel. Dr. Liu doesn’t explain in what sense he thinks the Lineage A sample is “low-quality”, and the Western scientists who I asked about this said they didn’t understand this complaint and that the sample was fine.
A Western team re-analyzing the same sample describes it as “conclusively contain[ing] Lineage A.” I think most lab leakers have switched from trying to deny the genetics to claiming that this was “contamination”, which also doesn’t make sense (the sample is genetically very early). Note that aside from this sample, the first two Lineage A cases discovered were both very close to the wet market. 7. Bloom (2023) shows market samples do not support market origin. There is also no evidence of transmission in the claimed susceptible animals elsewhere. https://academic.oup.com/ve/advance-article/doi/10.1093/ve/vead089/7504441 Discussed extensively in my article as well as the first section of the debate. 8. Lineage A and B only two mutations apart. François Ballox, Bloom and Virginie Courtier-Orgogozo note this is unlikely to reflect two separate animal spillovers as opposed to incomplete case ascertainment of human to human transmission (Bloom 2021). Discussed extensively in my article as well as the first section of the debate. 9. Sampling bias. George Gao, Chinese CDC head at the time, acknowledged to the BBC stating they may have focused too much on and around the market and missed cases on the other side of the city. David Bahry outlines the documented bias. Michael Weissman has shown this mathematically. https://journals.asm.org/doi/10.1128/mbio.00313-23 https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnae021/7632556 Re: Dr. Gao, see above comment about Chinese officials. See the section Ascertainment Bias below for why I disagree with this specific claim, which also addresses the Michael Weissman argument. 10. Spatial statistics experts show the Worobey claim the market was the early epicentre was flawed. 
https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954 Re: 10 - See Confirmation Of The Centrality Of The Huanan Market Among Early COVID-19 Cases, a response to the paper you cite: The centrality of Wuhan's Huanan market in maps of December 2019 COVID-19 case residential locations, established by Worobey et al. (2022a), has recently been challenged by Stoyan and Chiu (2024, SC2024). SC2024 proposed a statistical test based on the premise that the measure of central tendency (hereafter, "centre") of a sample of case locations must coincide with the exact point from which local transmission began. Here we show that this premise is erroneous. SC2024 put forward two alternative centres (centroid and mode) to the centre-point which was used by Worobey et al. for some analyses, and proposed a bootstrapping method, based on their premise, to test whether a particular location is consistent with it being the point source of transmission. We show that SC2024's concerns about the use of centre-points are inconsequential, and that use of centroids for these data is inadvisable. The mode is an appropriate, even optimal, choice as centre; however, contrary to SC2024's results, we demonstrate that with proper implementation of their methods, the mode falls at the entrance of a parking lot at the market itself, and the 95% confidence region around the mode includes the market. Thus, the market cannot be rejected as central even by SC2024's overly stringent statistical test. I think this response is pretty strong. In one analysis, they show that even though the other paper’s methodology is worse than theirs, if you apply it correctly (instead of inappropriately excluding various cases like the paper’s authors did), the center of all early cases in Hubei province lands on the wet market parking lot. 
In another analysis, they show that the other paper’s recommended tests wouldn’t have correctly pointed to the offending water pump in the famous John Snow cholera outbreak, but theirs would have. Still, I think it’s useful to supplement fancy statistics with normal common sense, so I recommend just looking at the map of early cases: …and deciding whether you think the assumptions behind a specific statistical test are likely to debunk the idea that cases are centered around the wet market. 11. Wuhan used as a control for a 2015 serological study on SARS-related bat coronaviruses due to its urban location. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6178078/ I don’t know why this point is supposed to matter. If you mean that Wuhan isn’t directly exposed to bats, nobody ever said it was. The zoonotic theory is that wildlife carted in from other areas of China started the pandemic in the wet market. 12. Superspreader events also seen at wet markets in Beijing and Singapore (Xinfadi and Jurong). This was discussed very extensively in the debates, both in section 1 and section 3. Wet markets weren’t “superspreader locations” - in fact, the disease spread no more quickly there than anywhere else. They were the first place in those cities that the pandemic started, due to contaminated animal products. If anything, this supports zoonosis. See also my discussion with Saar on this point below. 13. WIV refuse to share their records with NIH who terminated subaward in 2022. Wider suspension over biosafety concerns. https://www.bloomberg.com/news/articles/2023-07-18/us-suspends-wuhan-institute-funds-over-covid-stonewalling Although WIV has not been especially forthcoming, some of their databases were leaked in various ways and showed that they did not have any viruses capable of transforming into COVID. 14. PLA involvement at WIV and MERS research prior to SARS-COV-2. MERS features several similarities with SARS-CoV-2. 
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7022351/ I can’t even tell what conspiracy theory you’re trying to propose with this one; if you spell it out I can try to explain why it might be false. 15. SARS1 leaked several times and SARS-COV-2 has leaked from a BSL-3 lab in Taiwan. Agreed that SARS leaked several times. It also spilled over from animals several times. During the debate, a lab leak rate of once per lab per 500 years was proposed (everyone agreed to steelman this by 10x for WIV numbers); I would be interested to know whether anything about the study of SARS challenges that number. 16. Unpublished infectious clone identified from Wuhan contradicting arguments such reverse genetics systems would be published. https://www.biorxiv.org/content/10.1101/2023.02.12.528210v1.full I asked some scientists about this paper and here’s what they told me. Wuhan University sequenced some rice. In the middle of the sequence, there’s an unexpected sequence from a common coronavirus, HKU4. The most likely explanation is that someone else in Wuhan was working on the coronavirus and there was cross-contamination. Plausibly this is Wuhan Institute of Virology, who is known to work with coronaviruses. This is cool detective work, but it’s not clear what it’s supposed to prove. I think some lab leakers are using it to prove that WIV can do reverse genetics, but they admitted this already in a published paper so that’s not too helpful. I think others are using it to prove WIV had “secret viruses” in their catalogue, but the rice virus wasn’t secret, it was HKU4, which is common and which WIV has already published papers about. 1.6: DrJayChou’s 7 Arguments Once again, I cannot stress enough how much better a take you might have on this debate if you watch it. “The first known case predates the market outbreak by a month” - this is not the consensus position. I cannot say for sure what Dr. 
Chou means by this, but I suspect he’s referring to one of the many claims to this effect that Peter effectively debunked during the debate (Connor Reed, Mr. Chen, the 92 cases, Brazil, etc).
HKU1 might also fit these criteria. It’s a coronavirus discovered in 2004 that seems to have spilled over in China and spread globally (it’s fine; it just causes yet another subtype of common cold). The exact animal reservoir has never been identified, although Wikipedia says it “likely originated from rodents”.
In May 2003, Guan et al (2003) identified SARS-CoV-like virus in animals in a live-animal market in Shenzhen, Guangdong Province, China. Guan et al (2003) also tested for antibodies among workers in the market. They note that “8 out of 20 (40%) of the wild-animal traders and 3 of 15 (20%) of those who slaughter these animals had evidence of antibody, only 1 (5%) of 20 vegetable traders was seropositive.” This suggests that the majority of the infections of the 11 people with close contact with animals were zoonotic. Among 508 animal traders, 66 (13%) tested positive for IgG antibody to SARS associated coronavirus by ELISA, while the control groups including hospital workers, Guangdong CDC workers, and healthy adults at clinic had an antibody prevalence of 1–3%.
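The percentages quoted in this excerpt can be checked with simple arithmetic. The following is a minimal sketch, not anything from the cited studies: the group labels and the helper name `prevalence` are illustrative, and the only inputs are the counts stated above.

```python
# Counts as quoted in the excerpt: (seropositive, total tested).
groups = {
    "wild-animal traders": (8, 20),
    "animal slaughterers": (3, 15),
    "vegetable traders": (1, 20),
    "508-trader survey": (66, 508),
}


def prevalence(positive: int, total: int) -> float:
    """Fraction of tested people who were seropositive."""
    return positive / total


rates = {name: prevalence(p, n) for name, (p, n) in groups.items()}

# Pooled rate for the 35 people with close animal contact (8 + 3 = 11 positives).
close_contact = prevalence(8 + 3, 20 + 15)

# How many times higher the close-contact rate is than the vegetable-trader rate.
ratio_vs_vegetable = close_contact / rates["vegetable traders"]

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(f"close contact, pooled: {close_contact:.0%}")
print(f"ratio vs vegetable traders: {ratio_vs_vegetable:.1f}x")
```

The pooled comparison is an added illustration; with samples this small it only shows the direction of the difference, not a precise risk estimate.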
…and they’re now the most valuable company in Europe. So they can probably eat the loss. What happens when the shortage ends? Compounding pharmacies are only allowed to do this because of a law that suspends some drug regulations during a “shortage”, ie when the drug is on the FDA’s drug shortage list. At some point, Novo Nordisk will build enough factories to meet capacity and there won’t be a shortage anymore. What then? Will the fun be over? Will GLP-1 agonists go back to costing $1,200/month again? Will most of the current users have to stop the drug and regain the lost weight? This would make tens of thousands of people really mad. I don’t know if the FDA has the guts to offend that many people. Their style is more to crush drugs before they ever come out, before anyone knows what they’re missing. During COVID, the DEA said that telemedicine was allowed to be cheap and convenient so patients could get care during lockdown. After the pandemic died down, they tried making it hard and expensive again, but so many patients protested that they backed off. The uproar we’ll get if the FDA tries to make GLP-1 drugs expensive again will make that one look like a tempest in a teapot. But Big Pharma will be even angrier if they don’t. And besides, they can’t keep the drug on their shortage list if everyone knows there’s no shortage. I really don’t know what will happen, and I don’t envy whichever FDA official is in charge of setting a policy on this. I did see one proposed solution somewhere or other (sorry if it’s yours and I’m not crediting you). Compound pharmacies are always allowed to make compounded medications for specific patients who have a “medical necessity” for a non-FDA-approved product. So in theory, you could try something like: Tell the patient to say that Ozempic causes them nausea.
MMW: For the next 20 years, news outlets will claim a new virus is the next COVID
Ron Conway is one of Newsom’s closest allies and biggest donors. In 2021, after Newsom broke his own COVID rules to go to a fancy dinner, some Californians tried to get him recalled (ie got votes to hold a special election to impeach the governor). Conway (net worth $1.5 billion) helped coordinate Big Tech around opposing the recall and personally donated $200K to the anti-recall campaign; he apparently lobbied against the bill, and plausibly leads the list of people the Governor owes favors to.
The ability to replicate more and more of the functions of human intelligence on a machine is both very exciting and incredibly risky. Personally I am deeply alarmed by military applications of AI in an age of great power competition. The autonomous weapons arms race strikes me as one of the most dangerous things happening in the world today, and it’s virtually undiscussed in the press. The conceivable harms from AI are endless. If a computer can replicate the capacities of a human scientist, it will be easy for rogue actors to engineer viruses that could cause pandemics far worse than COVID. They could build bombs. They could execute massive cyberattacks. From deepfake porn to the empowerment of authoritarian governments to the possibility that badly-programmed AI will inflict some catastrophic new harm we haven’t even considered, the rapid advancement of these technologies is clearly hugely risky. That means that we are being put at risk by institutions over which we have no control.
Trying to be maximally charitable, I think he’s saying that the penalty for infecting someone with HIV was much more severe than the penalty for infecting people with other diseases, that this was a relic of the age of mass panic over AIDS, and that now that we’re panicking less we should bring the penalties back into line. But his argument style actively alienates me, focusing as it does on “reducing stigma” against AIDS patients. This brings back too many bad memories of the days when we weren’t allowed to try to prevent COVID from reaching the US, or prepare for it when it did, because that might “cause stigma” against Chinese people. I’m now permanently soured on all stigma-based arguments, and the current issue under discussion - declaring that it’s not such a big deal to intentionally give people AIDS, because if we admitted it was a big deal then that might cause “stigma” - is a perfect example of why this turns me off. I’d be more comfortable if he’d ignored the “stigma” angle and just tried to argue that the penalties were out of line with equally dangerous diseases (which I haven’t yet seen evidence about).
From “Genesis and pathogenesis of the 1918 pandemic H1N1 influenza A virus”, linked above. You may recognize the lead author - Michael Worobey has also been a leading voice on the zoonotic side of the COVID origins debate. The recent history of the flu, as far as I can tell, is: 1918: An H1N1 flu (“Spanish flu”) jumped from birds to humans in America and killed 50 million people worldwide. This replaced all older strains, so most seasonal flus during this era were H1N1. 1957: An H2N2 flu (“Asian flu”) crossed from birds to humans in China, and killed about 2 million people worldwide. It replaced the H1N1 strain, so most seasonal flus during this era were H2N2. 1968: An H3N2 flu (“Hong Kong flu”) crossed from pigs (?) to humans in Hong Kong, and killed another 2 million people worldwide. It replaced the H2N2 strain, so most seasonal flus during this era were H3N2. 1977: An H1N1 flu (“Russian flu”) leaked from a biology lab (?) in Russia (it might have been a strain from the 1940s, which the Russians were trying to make a vaccine for). It didn’t kill that many people, but it stuck around, and from then on, seasonal flus could be either H3N2 or H1N1. 2009: An H1N1 flu (“Mexican flu” until the PC police stepped in; afterwards “swine flu”) took some horrible circuitous route between birds and pigs and back again, crossed over into humans in Mexico, and killed 200,000 people. It outcompeted older strains of H1N1, but couldn’t crowd out H3N2, so seasonal flus are still either H3N2 or H1N1. …which brings us to the present, hopefully illuminating why “new flu strain crosses over from animals into humans” is such an “uh oh” moment. The Bird Flu Technically, all pandemic flus start as bird flus. Influenza A evolved in birds. Sometimes it spreads to other animals, including pigs, cattle, and humans. 
The most common way for a bird flu to spread to humans is to “reassort” (not exactly virus sex, but close enough, and the real version is less memorable) with a human flu virus (ie one that has already crossed over to humans). The resulting virus has all of the human flu virus’ human adaptations, but borrows enough new antigens from the bird virus to evade the immune system. Pigs can be infected by both human and bird viruses, so they are a common place for this reassortment to take place. If reassortment is sort of like viral sex, pigs are sort of like Tinder. When a bird flu and human flu reassort in pigs, the resulting disease is called a swine flu. At least the 2009 flu pandemic was a swine flu, and a minority opinion thinks the 1918 pandemic was too. There aren’t major epidemiological differences between direct-from-bird flus and swine flus. H5N1 was first noticed in birds - specifically, a flock of chickens in Scotland in 1959 - after which it disappeared for forty years. In 1996, it showed up in geese in China, then gradually increased its market share among birds worldwide. In 2022, it was found in minks; apparently it had learned to infect mammals. By early 2024, it was seen in cows. Now it’s in cow herds in 16 states, and one of them (California) has declared a state of emergency. And in October, H5N1 was found in pigs for the first time. It’s not uncommon for humans to catch an animal disease. This doesn’t mean the disease has “crossed over” to humans. If the virus isn’t suited to human-to-human transmission, it simply dies off (either before or after killing its human host). Thus, chicken farmers have been reporting scattered H5N1 cases since 1997; now that the virus has spread to cattle, cow farmers have started reporting the same. A Metaculus comment on this topic introduced me to the phrase “biocomputational surface”. 
Every viral replication that takes place in a human gives the virus one more chance to develop the set of mutations that makes it human-transmissible and start the next pandemic. Or, more likely, every viral replication that takes place in a human who has both the H5N1 bird flu and a normal human flu - or in a pig which has both viruses - gives the virus one extra chance to reassort in a way that produces a bird-antigen-fortified human-adapted flu virus. This doesn’t mean H5N1 will definitely become human-transmissible soon. Many viruses hang out on the borders of transmissibility for decades. Some, for unclear reasons, never cross over at all. But all of this is compatible with the virus becoming transmissible soon. So: What Is The Chance Of A Pandemic? The prediction markets on this topic ask a question about “10,000 cases in the United States”. Does this necessarily mean “pandemic”? Might it be possible to get to 10,000 cases just from the scattered chicken and cow farmers, with no human-to-human transmission? Despite many chicken and cow infections this year, there have only been 60 - 70 recorded human cases. Unless there is a phase change in screening methods, it seems hard for this number to increase to 10,000 off farmers alone. I think it’s fair to treat this question as operationalizing “what is the chance of a pandemic”? By this definition, Manifold estimates a 40% chance of an H5N1 pandemic in 2025. Metaculus estimates a 5% chance. You can see below whether that’s changed since I wrote this essay: 5% versus 40% is a big difference! Who do we trust? I trust Metaculus. Metaculus has beaten Manifold in both of the two head-to-head comparisons that I know of (Jeremiah Johnson’s and mine). Manifold’s number swings by a factor of two from week to week; Metaculus has been steady. But also, Metaculus hosts a CDC-sponsored respiratory disease forecasting tournament which has enriched them in epidemiological expertise. 
And if you look at the quality of comments on both sites, it’s pretty obvious where the people with more intellectual chops are hanging out. The Manifold comments are mostly single sentences, or occasionally just links to an article about new cases. The Metaculus comments look more like this one by dimaklenchin: Despite the panic propaganda, H5N1 is unlikely to be "just a single mutation away from switching host preference": 1) It normally takes a lot more than a single mutation to switch hosts. E.g., there are at least five different reasons why SIV (monkey equivalent of HIV) is not infectious to humans. Heck, a variant of SIV that bears HIV's receptor-recognizing surface protein (SHIV) is still not infectious to humans. HIV most certainly evolved from SIV but, almost as certainly, it took a very long time to get there. Not that all viruses are the same and things can't turn out differently with flu, but I don't subscribe to the idea that a mere change of receptor specificity (something that can take 1-2 mutations) will be sufficient. 2) We have data. Lots of human infections with other varieties of bird flu in the past - all those viruses ultimately went nowhere. Why would H5N1 be radically different? E.g., the "Canadian teen", despite what sounds like a prolonged exposure, failed to infect anyone around him. Since I am at 18% for the h-2-h H5N1 detection in 2025, I am arbitrarily going ~ an order of magnitude lower than that for something as unprecedented as 10K human infections. Maybe should be much lower but hedging for the time being and will allow another couple months of observations. And Sergio: I'm currently at 20% on the question of reported human-to-human transmission of highly pathogenic avian influenza H5N1 globally before 2026. However, this question is only about the US, and is more general about all subtypes of H5. But H5N1 very strongly appears to be the most important subtype to consider in this time period. 
And, given the current situation in the US with H5N1 human cases derived from exposure to poultry or cattle (with cattle(mammals) being more worrisome), h2h transmission seems quite more likely to arise in North America than elsewhere before 2026. Conditioning on h2h transmission in the US (and also trying to consider, with lower probability, a start in Canada), I want to estimate the chances that it becomes sustained and out of control (in which case, if it starts in Canada, I largely expect it to spread to the US). The (6) past events of probable h2h transmission of avian H5(N1), none of which were sustained, could serve as a base rate, although I'm a bit wary of giving much weight to this precedent, since the last event was quite a while ago (2007), and also because reporting and testing standards may have improved considerably since then (so perhaps they might not have been classified as h2h transmission events if they had occurred more recently). The current situation in the US, and events such as the Canadian teen who got sick with H5N1, do suggest a higher background level of risk than normal (which would be reduced if a vaccine for cattle is licensed soon), but I'm wary of overupdating. Conditioned on sustained h2h transmission, reaching over 10k cases in a few months seems likely, although perhaps very strong monitoring and surveillance could contain the situation in time (at the very least to moderate the growth rate). Trying to combine all these factors somewhat haphazardly, I'm currently at 3.5% for this question. That’s before 2026. What about longer-term? Manifold gives a ~50% chance before 2030; Metaculus uses a more complicated method but it says about 25% chance before 2030. H5N1 may cross to humans, but it could take a while. 
Superforecaster Juan Cambeiro at The Institute For Progress estimated a 4% chance of a “worse than COVID” H5N1 pandemic in “the next year”, but their estimate was made in 2023, without the benefit of the Metaculus estimates or most of our current knowledge. This feels high now - Metaculus says 5% total for H5N1 pandemic, and most pandemic flus are not worse than COVID. IFP also seem to be expecting a case fatality rate greater than 10%, which I find unlikely for the reasons mentioned above. I trust their estimate less than Metaculus’ current ones. I conclude that the most plausible estimate for the chance of an H5N1 pandemic in the next year is 5%. Interestingly, 5% is about the base rate for pandemic flus per year: five in the past century = one per twenty years = 5% chance per year. Isn’t it surprising that we’re still at the base rate when we can see a dangerous-looking flu virus spreading through the types of animals that have caused pandemic flus in the past? Part of the answer is that we’re not - in addition to the 5% chance of H5N1, we have to add the chance of some other pandemic flu. This probably isn’t 5% on its own; scientists monitor flu strains closely, and they haven’t found any others which are giving off as many red flags as H5N1. Still, something could always come out of left field. Maybe we should add a 2.5% chance of some other strain, for a total of 7.5% chance of a flu pandemic (ie beyond normal seasonal flu) next year. But still, isn’t it surprising that we’re so close to the base rate? One way to think about this: the base rate represents how concerned we should be if there was no epidemiological monitoring at all. In that case, we would estimate a probability distribution across different epidemiological landscapes, most of which contain some concerning-looking flu strains. 
Since we are doing the epidemiological monitoring, we can collapse that distribution into a single picture: one flu strain, H5N1, is in fact pretty concerning, and other strains mostly aren’t. This is enough to move our prior from 5% to 7.5%, but no more. The forecasters I talked to raised one other point of uncertainty: does the flu work more like a dice roll, or like a bus? Dice rolls are uncorrelated with their predecessors; even if it’s been a hundred rolls since you last rolled a 6, your chance this time is still 1/6. But buses come at fixed intervals; if the buses are hourly, and you haven’t seen a bus in the past 59 minutes, then your chance of seeing a bus in the next minute is very high. It’s been 16 years since the last flu pandemic; these pandemics come (on average) every 20 years. I don’t think anyone has a good sense of how to think about this. But it was 40 years between the Spanish and Asian flus, so the twenty year number is at best a rule of thumb. The 5% number feels very low to me (and, apparently, to the average Manifold forecaster). Isn’t H5N1 spreading to cows and pigs and all sorts of other mammals? Isn’t it in the news all the time? I trust Metaculus a lot, but I agree that this is a surprising update, and I’m taking it on faith rather than feeling it in my bones. What Would The Fatality Rate Be For An H5N1 Pandemic? There are four basic stories you could tell about likely H5N1 mortality. First, maybe mortality would be 50%. The argument here is that official statistics report this mortality rate in the chicken farmers who have been infected with H5N1 so far. Several news sources and even some scientists have raised the specter of a pandemic version of H5N1 with this same death rate, which could kill a quarter to a third of the world population. THIS IS EXTREMELY FAKE. The official statistics only report fatality rate in the infections we know about. 
Bird flu is rare, there’s no mass testing, and we only learn that somebody had it if they’re in a hospital and the doctors are worried enough to test for rare conditions. Of Americans who got bird flu in the past year, 0 out of 61 have died. Probably this is mostly because America upped its detection game and is now finding milder cases; we also can’t rule out the virus mutating to become less virulent. Metaculus estimates the current true mortality rate as 1.25%. …but leaves a wide 90% confidence interval, from 0.5% to 7%. Second, maybe mortality would be somewhere around 1.25%. The argument here is that Metaculus uses this as its central estimate of US mortality. But Sentinel discusses some reasons to be skeptical of broad inferences from the US numbers: Scientists have been puzzled by the apparently low H5N1 case fatality rate in humans in the US. They offer a number of hypotheses: “The way in which the virus is being transmitted — along with the amount of virus exposure — is limiting the severity of disease.”
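The base-rate arithmetic and the "dice roll or bus" question from earlier in this excerpt can be made concrete. What follows is a rough sketch under stated assumptions, not a forecasting model: the function names are illustrative, and the "bus" model assumes the gap between pandemics is uniformly distributed over a range (10 to 30 years here, chosen only so that the mean is 20). The excerpt itself notes that nobody has a good sense of which picture is right.

```python
# Base rate quoted in the excerpt: five pandemic flus in the past century.
pandemics_per_century = 5
base_rate = pandemics_per_century / 100  # 0.05 per year

# The excerpt's rough split: 5% for H5N1 plus 2.5% for some other strain.
p_h5n1 = 0.05
p_other = 0.025
p_any_flu_pandemic = p_h5n1 + p_other  # about 0.075


def dice_hazard(mean_interval: float) -> float:
    """Memoryless model: the same yearly chance no matter how long it has been."""
    return 1 / mean_interval


def bus_hazard(elapsed: float, lo: float, hi: float) -> float:
    """'Bus-like' model (assumption): the gap is uniform on [lo, hi] years.

    Returns the chance the next pandemic arrives within one year,
    given that `elapsed` years have passed without one.
    """
    if elapsed >= hi:
        return 1.0
    start = max(elapsed, lo)            # earliest time the gap can still end
    end = min(elapsed + 1, hi)          # end of the one-year window
    window = max(end - start, 0.0)      # part of the window inside [lo, hi]
    remaining = hi - start              # probability mass still available
    return window / remaining


years_since_last = 16  # per the excerpt
print(f"dice model: {dice_hazard(20):.1%} per year")
print(f"bus model:  {bus_hazard(years_since_last, 10, 30):.1%} in the next year")
```

Under the dice model the yearly chance stays at the base rate; under the bus-like assumption it rises as the quiet period lengthens. How much it rises depends entirely on the assumed range.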
I agree with this solution. 3: Ruxandra Teslo and Willy Chertman: The Case For Clinical Trial Abundance 4: This month in nominative determinism: NYT article calculating your chance of winning the lottery, by Victor Mather (h/t Yafah Edelman). 5: Someone is working on a dating site that uses your conversations with Claude to find a match. Link here, although so far it’s just a landing page where you can register interest (h/t @venturetwins) 6: The Lyttle Lytton Contest searches for the worst possible opening line for a novel; it’s been going on since 2001 and this year’s results are in. 7: Gary Marcus and Miles Brundage have made a bet about AI progress. I agree with @tamaybes and others in saying that Miles let Gary off too easily; Gary’s public statements all sound like “modern AI is mostly hype, it doesn’t really do anything like thinking”, but the bet is about things like “will AI make a Nobel Prize caliber scientific discovery by 2027?” and “will AI write Pulitzer-quality books by 2027?” I don’t blame Gary for taking the best terms he could find. But I am worried that if AI makes a Nobel-quality scientific discovery in 2026, but doesn’t quite write the Pulitzer-quality book, then Gary will get to claim victory over the AI optimists, whereas in fact that would probably be at the 95th percentile of fast timelines by most people’s estimate. 8: “The probability that cows (or other non-human animals) are experiencing constant bliss, lack tanha (craving, aversion, and the resulting suffering), or are "enlightened by default" is, by my estimation, very low”. 9: Recursive Adaptation (blog on addiction policy)’s predictions for 2025. 75% chance of FDA approval of GLP-1 for a substance use disorder by 2029! 10: In my post on the economics of GLP-1 receptor agonists (eg Ozempic), I wrote about how they’re currently widely available because of a loophole suspending patents during a shortage, and predicted there would be a big fight when the shortage was over. 
Sure enough, the FDA tried to declare that the shortage of tirzepatide (a next-generation Ozempic relative) was over, compounding pharmacies sued, and tirzepatide is still available while the issue goes through the courts (and will the administration have an opinion?) Also, compounding pharmacy access startup Mochi says that they will continue to prescribe even if the shortage is over, using another loophole saying doctors can do this for specific individual patients in cases of medical necessity. This is an extremely fake use of this loophole, but will the government be willing to call their bluff? 11: Jacob Falkovich has a blog on dating advice, which he plans to turn into a book of dating advice. I can’t really comment on the accuracy (my dating strategy tends to look more like waiting for women to send me emails saying “I like your blog, would you like to go on a date?” which probably doesn’t generalize), but I’ve had many good interactions with Jake, and he has a beautiful family which means he must be doing something right. Also, Jake is poly, and I sometimes wonder if poly people are the only ones qualified to give dating advice: if you’re monogamous, you either met your future spouse quickly (in which case you have no experience), dated for years without meeting your spouse (in which case you can’t be very good), or aren’t looking for a committed relationship at all (which is just pickup artistry, and follows very different dynamics). Poly people are the only ones who can break out of this trilemma! 12: Christ And Counterfactuals is a blog on effective altruism from a Christian perspective. Some previous attempts at this have felt kind of forced, but the first post I read here was actually pretty interesting. 
Richard Swinburne (apparently “the world’s best Christian philosopher”), thinks that: “[One] reason why it is good that the human race should sometimes be in an initial situation of considerable ignorance about the causes and effects of our actions, is this. If God abolished the need for rational inquiry and gave us from childhood strong true beliefs about the causes of things, that would make it too easy for us to make moral decisions. As things are in the actual world, most moral decisions are decisions taken in uncertainty about the consequences of our actions. I do not know for certain that if I smoke, I will get cancer; or that if I do not give money to some charity, people will starve. So we have to make our moral decisions on the basis of how probable it is that our actions will have various outcomes—how probable it is that I will get cancer if I continue to smoke (when I would not otherwise get cancer), or that someone will starve if I do not give. Since probabilities are so hard to assess, it is all too easy to persuade yourself that it is worth taking the chance that no harm will result from the less demanding decision (the decision which you have a strong desire to make). And even if you face up to a correct assessment of the probabilities, true dedication to the good is shown by doing the act which, although it is probably the best action, may have no good consequences at all.” (Could a Good God Permit so Much Suffering? A Debate, pp. 52-53.) This is pretty galaxy-brained, but something galaxy-brained must be going on for God to tolerate the existence of evil at all, and this is a surprisingly natural extension of some common premises on the subject. 13: Swedish study: diagnosing the marginal patient with a psychiatric condition makes their life worse. Of the two mechanisms they looked at, stigma seems more involved than drug side effects. 
My opinion: this study was done on conscripts undergoing a mandatory psych evaluation for the army, who had no previous reason to think they had a psych disease and had not sought treatment. This is a different situation from somebody who comes to a psychiatrist asking for relief from specific symptoms they have noticed. Also, Sweden c. 2005 is a different culture from America 2025 in terms of how much stigma a psych diagnosis carries. I think it’s possible that if you never considered that you had psychiatric problems, and were suddenly given a diagnosis in 2005 Sweden and told you couldn’t serve in the army, that’s likely to destabilize your self-image more than a person who knows they’re depressed going to a psychiatrist in 2025 US and getting antidepressants. 14: RIP Felix Hill, research scientist at DeepMind and mentor to many in the AI community. You can read his suicide note here, though the obvious content warning applies. He says he took ketamine for mild anxiety and it plunged him into an incredibly deep depression that he couldn’t get out of; he leaves his story behind as a warning for others. I appreciate his warning, but I wish he had said more about what dose he used; different people’s ketamine doses vary by almost two orders of magnitude, I’d previously thought that the low doses were pretty safe and the high doses were sketchy, and I would like to know whether I should update or not. 15: RIP Max Chiswick, professional poker player, effective altruist, and ACX reader. 16: Adrian Dittman, a Twitter account widely accused of being Elon Musk’s alt, has been revealed to be . . . a guy named Adrian Dittman. Congrats to Maia Crimew and the Spectator for actually investigating this, unlike many other news sources which spread the Musk conspiracy theory. Also, the people involved got banned from X for some reason, maybe because this qualified as doxxing Dittman. 17: Related: Musk claims to be among the top players in the world at several computer games. 
A veteran Path of Exile gamer presents evidence that Musk faked his PoE2 accomplishments by hiring a Chinese guy to play on his account. Some Musk supporters in the comments suggest that maybe he hires the Chinese guy to level up his account, but his accomplishments (eg speedruns) are still his own? 18: Related: Sam Harris says he has been friends with Musk since 2008, but he noticed a sudden shift for the worse in his personality around 2020 which made it impossible to stay friends with him. He gives the example of Musk losing a bet with him that there would be 35,000+ COVID cases in the US, refusing to pay up, and launching personal attacks on Sam when asked to do so. What happened? Some theories: Musk turned right-wing, which ended his friendship with Sam for the same reason political differences have always ended friendships (but then what about the bet, which seems like objectively bad behavior?)
Warp Speed for Air Quality: RFK Jr’s “Make America Healthy Again” philosophy is a horseshoe-theory-style union between typical conservative concerns about purity and typical liberal concerns about the environment. There are many ways it could go wrong, but one place it might go very right is in air quality. Recent research has highlighted the role of air quality in both chronic disease (eg particulate matter in the air causing lung problems) and infectious disease (despite the WHO’s attempts at weird language games, respiratory viruses including colds and COVID are airborne). There are some really innovative solutions (advanced air filters, UV-C technology) on the horizon, but we don’t expect this administration to want to throw a lot of money into blue-sky research. Instead, we suggest taking a page from the first Trump administration playbook and offering Operation Warp Speed style advance purchase agreements, which guarantee a market if and only if the technology works. An air quality Warp Speed could go a step further and target a result that is cost-saving to the federal government. For example, you could set the goal of reducing airborne disease in military housing by 50% by the start of 2028. Because the government pays for military healthcare, this would save costs and also create the evidence needed for private industry (workplaces, nursing homes, cruise ships, etc.) to implement air quality interventions for themselves.
Regulatory Capacity For Emergencies/Pandemics: In 2020, Makary criticized the FDA’s slow response to COVID. This time, we could try to be ahead of the game by building the FDA’s capacity and expertise in advance. During peacetime, this team could work on a universal flu vaccine and a pandemic equivalent of the START pilot pioneered by FDA biologics director Peter Marks.
Bhattacharya is a rare doctor and medical professor who also has a PhD in economics. His contrarian COVID positions provoked censorship and harassment from Big Tech and the academic establishment; the experience seems to have low-key traumatized him, and his preliminary policy proposals, listed here, focus on using the NIH's grant-giving power to shake up the orthodoxy that wanted him silenced. Here are some other policies we hope he’ll look into:
St. Felix publicly declared that he believed with 79% probability that COVID had a natural origin. He was brought before the Emperor, who threatened him with execution unless he updated to 100%. When St. Felix refused, the Emperor was impressed with his integrity, and said he would release him if he merely updated to 90%. St. Felix refused again, and the Emperor, fearing revolt, promised to release him if he merely rounded up one percentage point to 80%. St. Felix cited Tetlock’s research showing that the last digit contained useful information, refused a third time, and was crucified.
Rather than rescue this with appeals to age or some other variable making these deaths not count, I think we should think of it as a bias, fueled by two things. First, dead people can’t complain about their own deaths, so there are no sympathetic victims writing their sob stories for everyone to see. Second, controversy sells. We fight over lockdowns, lab leaks, long COVID, and vaccines, all of which have people arguing both sides, and all of which let us feel superior to our stupid and evil enemies. But there’s no “other side” to 1.2 million deaths. Thinking about them doesn’t let you feel superior to anyone - just really sad.
Five years later, we can’t stop talking about COVID. Remember lockdowns? The conflicting guidelines about masks - don’t wear them! Wear them! Maybe wear them! School closures, remote learning, learning loss, something about teachers’ unions. That one Vox article on how worrying about COVID was anti-Chinese racism. The time Trump sort of half-suggested injecting disinfectants. Hydroxychloroquine, ivermectin, fluvoxamine, Paxlovid. Those jerks who tried to pressure you into getting vaccines, or those other jerks who wouldn’t get vaccines even though it put everyone else at risk. Anthony Fauci, Pierre Kory, Great Barrington, Tomas Pueyo, Alina Chan. Five years later, you can open up any news site and find continuing debate about all of these things.
The only thing about COVID nobody talks about anymore is the 1.2 million deaths.
It was higher. This is via census.gov, from the National Center For Health Statistics:
From the CDC, via the White House, correlation between reported COVID-19 deaths and excess deaths throughout the pandemic:
It shows that whatever the cause of the excess mortality, it was closely correlated with reported COVID-19 deaths. Although you can almost imagine a doctor not being able to tell the difference between a COVID death in the ICU and a respirator-injury death in the ICU, they could probably tell the difference between a COVID death in the ICU and a bullet through the brain in a young person who probably didn’t even have a positive COVID test.
(is CRS unusual in its low overhead, maybe because of its Catholic affiliation? I checked one of USAID’s biggest secular partners, the Johns Hopkins Program for International Education on Obstetrics and Gynecology (JHPIEGO), which despite its name operates AIDS, malaria, and COVID clinics in Africa. Its NICRA was 17% and its true overhead was 3.9%, and I couldn’t otherwise find any big difference from CRS.)
We then responded to home investigation requests in 2022 for two residents: a) one hospitalized with COVID-19 and later diagnosed with Legionnaires' disease (a type of pneumonia and leading cause of waterborne disease and deaths in the US) in Harrisonburg, VA, and b) another with Acanthamoeba keratitis (a rare eye infection) in a South Carolina town. Specifically, we packed and shipped sampling kits, probes, and instruction booklets/videos, and remotely assisted residents with measuring relevant water quality parameters, taking accurate water and biofilm swab samples, and shipping those back to our laboratory. Our team used quantitative and digital droplet PCR (qPCR/ddPCR) to test for Legionella pneumophila and Acanthamoeba bacteria. We did not find these pathogens at meaningful levels, although in at least the Harrisonburg case, the resident had followed CDC Legionella prevention guidance after a prior positive Legionella detection by increasing their water heater temperature, which could have contributed to successful remediation. The results were published in the scientific journal ACS ES&T Water. ACX funding provided partial support for the lead PhD student, supplies, analysis, and shipping costs.
Minnesota and Virginia also have legislation to enable cities to implement land value taxes. We are monitoring these efforts. There are a few other cities we are operating in. We have helped another organization prepare for a meeting in Tennessee by doing impact analysis of land value taxes in the city. We have presented to city officials in the City of South Bend who have expressed support for land value taxes. Finally, we are in conversation with a State Senator in Colorado who is a champion of land value taxes. Meanwhile, we have soft launched and developed the OpenAVMKit, which uses a unified schema to do assessment accuracy reports and automated valuation methods for any property tax data given. Valuation of land is the key binding constraint to successful implementation of land value taxes. We plan to be the leaders in this space with strong benchmarking capabilities and a repo that can enable the open-source community to make the best automated valuation methods. Along with these efforts, we have expanded the movement. We have posted to the Progress and Poverty Substack growing the subscriber base to around 5,000 subscribers. We have spoken to over 25 local advocates interested in working on land value taxes in their local communities. Yet, there is a long way to go. We need to start earning income through technical assistance contracts as our grant funding expires. We need to continue pushing for a state to implement, and we need to be prepared to tell the success story for when they do. 65: EN’s Work On Bacteriophage Therapy Our project is aimed at pioneering phage therapy in Nigeria, where limited resources/infrastructure have historically held back research in this field. Starting from the ground up, we are establishing the foundational systems needed to support a robust phage research ecosystem. So far, we’ve isolated 34 bacteriophages targeting Pseudomonas aeruginosa, an essential step toward building a comprehensive phage bank. 
This began with collecting a wide range of clinical Pseudomonas isolates, which we are now characterizing alongside the phages through genome sequencing and phenotypic assays including studies on phage stability across pH, temperature, and salinity ranges. Our long-term goal is to develop a phage-based hydrogel for treating diabetic wounds. On the regulatory front, we have secured approval from the Attorney General to register our nonprofit organization, the Centre for Phage Biology and Therapeutics. Additionally, we’re expanding into vaccine development; following a research stay in Prof. Roderick's lab at the University of Waterloo, we have initiated the design of a phage-based universal Salmonella vaccine aimed at covering all major serotypes—an urgent need underscored by Africa’s reliance on external vaccine sources during the COVID-19 pandemic. I have signed an MTA agreement with Roderick to use his phage-based vaccine platform patents to enable us to design vaccines against any common disease affecting us. This is only the beginning, but we are proud to be laying the scientific and institutional groundwork for homegrown phage innovation in Africa. Emergent Ventures funded EN before we did and deserves a lot of credit here also. 66: Create An Artificial Kidney For an implantable artificial kidney, the first essential component is a hemofilter designed to emulate the glomerulus. Critical requirements for this hemofilter include high permeability (to maximize flow for a given area), selectivity (specifically, the retention of albumin), and robust blood compatibility (ensuring sustained function over time). Our initial strategy focused on using negative surface charge to reduce fouling. I began by testing polyelectrolyte (PE) coatings on 24nm pore membranes featuring a negative terminal charge, similar to the glomerular barrier. 
These initial static tests, assessing platelet adsorption in whole blood, yielded positive outcomes for some polyelectrolytes, indicating potentially desirable blood compatibility. However, static test setups are not truly representative of dynamic in-vitro conditions and don't provide data on key parameters like permeability, fouling progression, or changes in membrane selectivity. To address these limitations, I designed and built a blood filtration setup. This system sustains human whole blood in circulation for 20 minutes, allowing us to analyze all the aforementioned parameters, as well as platelet activation markers. This has resulted in a fairly high-throughput system for evaluating any surface coating. I'm pleased to report this setup has been accepted for presentation at this year's European Society for Artificial Organs (ESAIO) conference. I am also currently working on a full manuscript, as I believe this system offers a viable way to partially replace animal experiments in our early-stage research, requiring only 1.2ml of human blood per run. Working with a PhD student (hired to support both this research and work on membrane substrates), we have continued testing these PE coatings, alongside PEG coatings, on our membranes. Here, we're finding that optimization of the coating layer is crucial. With the current PE coatings, we observe a permeability drop of about an order of magnitude compared to the base membrane, making them unsuitable for an implantable device in their present form. This is likely due to the specific nature of the initial PE layer, which we can modify. We also suspect there may be ingress of PE into the pores, meaning we're not achieving just a surface coating (our goal), but rather a very thick coating, which would explain the flux loss. Optimizing the coating process to control penetration depth is now a primary focus of my ongoing work. 
I am currently aiming for a flux of 20ul/min (as this is the cap introduced by the protein gel layer anyway) but for it to be at this 'steady state' permeability without a drop in permeability. I am also imaging the membranes after contact with SEM to see if there is indeed any platelet adsorption etc. Tugrul has the dubious honor of maybe being "the only person to climb a 4000m peak with severe kidney failure". To raise money and awareness for his artificial kidney project, he is running Climb Against Time, where he will climb 41 mountains over 4000m (13000 ft) this summer. He is looking for donors and climbing partners. 67: Add Tardigrade Genes To Human Cells The goal of this one was to make hybrid cells that are more resilient for research and certain medical applications. They report: The grant was to synthesize vectors for the expression of humanized tardigrade proteins that can be targeted to different areas of the cell. All the vectors were designed, generated, and transposed into human cells. The proteins all localize successfully (e.g. they match the designed target), with one exception (we are still working on validating it). We've done some stress testing with the transgenic cells, but haven't reached firm conclusions yet. We've further generated some multigene designs but have not yet transposed them into cells, but should shortly. We're hoping to submit a manuscript on the first round later this year. 68: Teach Forecasting To EU Policy-Makers The original project didn't work out, but our grantee (who still prefers to remain anonymous) is now working with an EU think tank pursuing the same agenda, and has been teaching forecasting workshops to policy-makers for the past two months. 69: Platform For Single-Cell Imaging They ended up unable to accept this grant and returned the money. 70: Open Source Polygenic Predictor For EA/IQ They have an update here. 
They think they have a predictor that can explain 12% of variance in intelligence, and they’re working on validating it and creating an easy-to-use website. 71: Improve Flu Vaccines The grant mainly funded agent based modelling to demonstrate the benefit of pre-existing immunity to pandemic influenza if and when a future pandemic occurs (academic publication will result). The original proposal was to attempt to influence the WHO influenza strain selection process. After attending WHO meetings and a global influenza conference, I believe this is not feasible. Stakeholder feedback was that the potential short-term negative effect on vaccine hesitancy is believed to outweigh the less tangible future benefit. Given the conservative nature of decision makers, pandemic vaccines are likely to remain research only. There are still green shoots of research into pandemic preparedness/prevention that I am continuing to work on. I'm working under the "Australians for Pandemic Prevention" brand of Good Ancestors, another group that ACX funded in 2024. 72: Scenario Analysis For Developing World Agricultural Programs In addition to the research and analysis funded by the grant, I’ve learned to code with LLMs and have built an MVP of the project. The app is being considered for further development by staff at a large international organization. 73: Further C’s Political Career C’s political career is going well, but he continues to think it wouldn’t be strategic to give more information publicly at this time. Lessons Learned I'm most impressed with our lobbying/advocacy organizations. In particular, Good Ancestors has gotten the Australian government to sign onto an international AI safety declaration, partner with various x-risk-related organizations, and (possibly) extend charity tax deductions to some EA causes that previously didn't have it - I think this on its own goes a substantial way to paying back the cost of all ACX Grants. 
Coalition to Modify NOTA has a kidney donation bill in front of Congress that the (very illiquid) prediction markets give a 45% chance of passing; if it works, it could save thousands of lives. The Georgists are partly responsible for bills making land value taxes slightly easier to implement in a handful of states. Good Science Project seems to have significantly improved science. Are lobbying organizations a better bet than other types of nonprofit (within the constraints of ACX Grants)? I'm not sure. It could just be that lobbyists are (naturally) better at playing themselves up and sounding successful than (for example) scientists, or that politicians are good at people-pleasing and make people feel heard and encouraged in a way that might not change overall policy later. Also, I recently talked to some grantmakers who funded a lobbying organization that superficially seems excellent, but they expressed concern it was net negative (!) by taking away oxygen and spotlight from potentially more effective orgs. So I am encouraged but wary. Animal welfare organizations were another standout success. Again, I don't know how to think about this - while I think our grantees were exceptional, there's also an issue where the scale of animal welfare challenges is so great, and work on them so neglected, that lots of organizations can save a million chickens here, or a million fish there, without particularly making a splash. On the one hand, this is exactly what effective altruism should be doing - exploring grants that are very high in linear utility even if they don't feel satisfying. On the other, they're unsatisfying - and also hard to assess retroactively. How many chickens should a good animal welfare grant save? Any realistic number will both be overwhelmingly large in absolute terms and far too small in relative terms. I'm most ambivalent about our science grants. 
Many of them say they are successful and can point to published papers which explain the science they did. But it's hard to judge whether anything useful has changed based on the science getting done. I know it's important to fund basic research and not just last-mile technology startups, but it's hard for a mini-grants program like this one to evaluate these kinds of abstract interventions. One disappointing result was that grants to legibly-credentialled people operating in high-status ways usually did better than betting on small scrappy startups (whether companies or nonprofits). For example, Innovate Animal Ag was in many ways overdetermined as a grantee - former Yale grad and Google engineer founder, profiled in NYT, already funded by Open Philanthropy - and they in fact did amazing work. On the other hand, there were a lot of promising ACX community members with interesting ideas who were going to turn them into startups any day now, but who ended up kind of floundering (although this also describes Manifold, one of our standout successes). One thing I still don't understand is that Innovate Animal Ag seemed to genuinely need more funding despite being legibly great and high status - does this screen off a theoretical objection that they don't provide ACX Grants with as much counterfactual impact? Am I really just mad that it would be boring to give too many grants to obviously-good things that even a moron could spot as promising? Someone (I think it might be Paul Graham) once said that they were always surprised how quickly destined-to-be-successful startup founders responded to emails - sometimes within a single-digit number of minutes regardless of time of day. I used to think of this as mysterious - some sort of psychological trait? Working with these grants has made me think of it as just a straightforward fact of life: some people operate an order of magnitude faster than others. 
The Manifold team created something like five different novel institutions in the amount of time it's taken some other grantees to figure out a business plan; I particularly remember one time when I needed something, sent out a request to talk about it with two or three different teams, and the Manifold team had fully created the thing and were pestering me to launch a trial version before some of the other people had even gotten back to me. I take no pleasure in reporting this - I sometimes take a week or two to answer emails, and all of the predictions about my personality that this implies would be correct - but it's increasingly something that I look for and respect. A lot of the most successful grants succeeded quickly, or at least were quick to get on a promising track. Since everything takes ten times longer than people expect, only someone who moves ten times faster than people expect can get things done in a reasonable amount of time. In almost every case where I thought to myself “this is a cool idea, but I don’t know how it’s going to really pay off, as opposed to reaching a cool intermediate accomplishment and then stagnating”, this was a correct criticism, and I should have taken it more seriously. But I can’t rule out that these were good in vague and hard-to-measure ways that I should take more seriously. This one is really self-serving, but in general when people were good communicators (or even bloggers) and wowed me with the writing-composition of their application, they turned out to be a good bet. And when people were hard to understand and annoying to communicate with, even if their ideas seemed good, they were less likely to pan out. Overall Thoughts The total cost of ACX Grants, both rounds, was about $3 million. Do these outcomes represent a successful use of that amount of money? Very naively, startups originating from ACX Grants have about $50 million in value. 
If ACX Grants is equivalent to a pre-seed funder, and pre-seed funders usually get ~5%, then if we were VCs we would have a portfolio worth $2.5 million. About 1/5 of ACX Grants were attempting to be market-valued startups, so if we assume the charitable portion did about as well as the startup portion, then the charity portion is “worth” $10 million. There’s some reason to expect this is too high, since much of the startup value came from one successful outlier. But there’s another reason to expect this is too low, since we were aiming at charity rather than market cap, and any actual market cap that our grantees got was an unexpected side effect. I’m treating this as a sanity check rather than as a real number. It’s harder to produce Inside View estimates, because so many of the projects either produce vague deliverables (eg a white paper that might guide future action) or intermediate results only (eg getting a government to pass AI safety regulations is good, but can’t be considered an end result unless those regulations prevent the AI apocalypse). Because we tend towards incubating charities and funding research (rather than last-mile causes like buying bednets), achieved measurable deliverables are thin on the ground. But here are things that ACX grantees have already accomplished: Improved the living/slaughter conditions of 30 million fish.
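The sanity-check arithmetic above can be written out explicitly. The sketch below is only a restatement of the figures given in the text ($50 million in startup value, a ~5% pre-seed stake, about 1/5 of grants being startups); the function and variable names are invented for illustration, and the "charity did about as well as the startups" step is the text's own simplifying assumption.

```python
def grants_sanity_check(startup_value: float,
                        pre_seed_share: float,
                        startup_fraction: float) -> tuple[float, float]:
    """Return (hypothetical VC stake, implied 'worth' of the charitable grants).

    startup_value:    total value of startups that came out of the grants
    pre_seed_share:   equity share a typical pre-seed funder would hold
    startup_fraction: fraction of all grants that were market-valued startups
    """
    vc_stake = startup_value * pre_seed_share
    # If charitable grants did as well per grant as startup grants, their
    # "worth" scales with how many more of them there were: (1 - f) / f.
    charity_ratio = (1.0 - startup_fraction) / startup_fraction
    return vc_stake, vc_stake * charity_ratio


stake, charity = grants_sanity_check(50e6, 0.05, 1 / 5)
print(f"startup stake:   ${stake / 1e6:.1f} million")
print(f"charity 'worth': ${charity / 1e6:.1f} million")
```

With the text's inputs this reproduces the $2.5 million and $10 million figures; changing `startup_fraction` shows how sensitive the second number is to the "about 1/5" estimate.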
I get the impression that they made a cool COVID vaccine, didn’t overcome the regulatory/ethical/practical challenges to deployment, mostly fizzled out after COVID became less exciting, and are now a collection of small passion projects. In retrospect, I think I should have been able to predict this before the original grant, and my theory of change was overly optimistic.
This graph shows that around 9% of comments will contain at least one token indicating the comment is discussing a sensitive topic, with a range of about 6% to 14%, disregarding the very early years where small sample size made the data more variable. There wasn’t any one ‘sensitive’ token in particular which correlated exceptionally well with the rise and fall of this 6% to 14%, which implies to me that we have correctly identified a general factor of ‘willingness to discuss sensitive topics’ (or possibly that the peaks and troughs correspond to peaks and troughs in the external landscape – ie specific touchpoints and lulls in the Culture War – which would also be fine for the purpose we’re putting it to). This is an imperfect measure because it only tracks if someone is using a sensitive phrase and not whether they are using it in a heretical way (cf. ‘fifty Stalins’ here). However, I thought in the context of ACX posts the approach was probably reasonable – sensitive phrases are only likely to appear if they are being discussed a lot, and we know from the previous section that discussion depth is high both now and during the 2016 peak engagement period. It isn’t necessarily true that deep discussion implies spirited debate - some political discussions on reddit can go into the thousands of comments without anyone ever actually expressing a counter-orthodoxy view – but I think in the specific context of ACX it is reasonable, because we don’t generally have norms of expressing substanceless agreement. Hopefully, therefore, the changing ratio of socially or professionally sensitive phrases to phrases not included in my dictionary would tell us something about the willingness of the comment section to engage in potentially emotive discussions at any point in time. 
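As a rough illustration of the measure described above (the share of comments containing at least one token from a dictionary of sensitive terms, tracked over time), here is a minimal sketch. The reviewer's actual dictionary, tokenizer, and data format are not given, so the token set (a few words the review names elsewhere), the `(year, text)` input shape, and the function name are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Hypothetical mini-dictionary; the review's real word list is not shown.
SENSITIVE_TOKENS = frozenset({"ivermectin", "microaggression", "misgender", "sjw"})


def sensitive_share_by_year(comments, tokens=SENSITIVE_TOKENS):
    """Fraction of comments per year containing at least one listed token.

    comments: iterable of (year, text) pairs.
    Matching is exact on lowercased words, so inflected forms such as
    "misgendered" are NOT counted unless they are added to the dictionary.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for year, text in comments:
        words = set(re.findall(r"[a-z']+", text.lower()))
        totals[year] += 1
        if words & tokens:
            flagged[year] += 1
    return {year: flagged[year] / totals[year] for year in sorted(totals)}


sample = [
    (2016, "Another SJW thread, I see."),
    (2016, "Great post, thanks!"),
    (2021, "Has anyone read the ivermectin meta-analysis?"),
    (2021, "I mostly lurk here."),
]
print(sensitive_share_by_year(sample))
```

As the review notes, a count like this only records that a sensitive term was used, not how it was used, and the result depends heavily on which words go into the dictionary.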
It is hard to draw clear conclusions about how the occurrence of these tokens relates to engagement with the comment section – although the peak does indeed look to be about 2016 or 2017, the data are noisy, and strongly affected by the choice of words to include in my dictionary. I picked the dictionary before I saw the data, but perhaps a different set of words would have given a different result, especially if I had a better way of identifying sensitive discussions around COVID (‘ivermectin’ was the only COVID-related word I could think of that became politicised in the same way ‘microaggression’ or ‘misgender’ did). Nevertheless, I would say this gives some weak support to the idea that 2016 was a turning point in SSC Commentariat free speech norms (and strong support to the idea that the start of ACX was a low point for discussion of sensitive topics). I include below a few specific sensitive phrases which I thought were interesting. Do note the different scales on each graph. Of particular interest to me is the ‘SJW’ graph, which has a really clear peak at exactly the high point of Commentariat engagement. I will return to this graph later in the review.
Complexity of thought measures show clear directional reversals on every measure except average word length (which has been steadily declining) in both 2017 and 2021. This would be great confirmation for the theory that quality declined in 2016 except you’ll notice that 2017 is a bit too late to explain that! Overall, I’d say that all four of these measures point to a change which occurred when the Commentariat moved to Substack, and two-and-a-half point to a change which occurred in 2016. To me, the ACX change is somewhat understandable – Substack has a different userbase and a different UI, and Scott started blogging there after a hiatus of nearly a year, so he lost some of the momentum and norms established from SSC. The start of ACX also coincided with another wave of COVID cases, which in some countries at least will have significantly altered the ‘online-ness’ of the general population. So, I don’t think we need to look especially hard for why ACX comments are a bit different to SSC comments. I also don’t think we need to look especially hard for why the ACX comments seem to be gradually moving towards looking like peak-SSC; it took three years for SSC to reach peak quality, so we could tentatively propose that there is some sort of inherent ‘bedding in’ time for new comment sections to feel out and formalise the norms they want to establish. Speculatively, perhaps Substack has a different mechanism for attracting readers than WordPress, so the beginning of ACX featured a mix of SSC old guard and Substack newcomers, and it is taking some time for the community norms of the SSC old guard to assert themselves onto ACX. The Commentariat seems capable of self-diagnosing the many ways in which the ACX change might have contributed to a decline in quality. For example, Moon Moth writes: I would posit that, for all of Substack's good qualities, the commenting experience is worse here. Which may be coloring commenters' overall impressions. 
[Expanding on this in another comment they write] Substack comments take too long to load, especially on mobile. And on mobile, they reload and lose my place whenever I switch tabs or apps … Which makes me reluctant to do anything but skim on mobile. And teddytruther writes: I also expect that this selection effect took a huge bump from the NYT controversy, which drew people primarily interested in Woke War Punditry and not a long series of guest posts on Georgist land taxes. The change which occurred in 2016 (and very specifically April 2016) is much less understandable to me. After some thought, I’ve come up with three possible hypotheses: Scott’s writing got worse in April 2016, causing mass disengagement, which changed the makeup of the comments section
The volume of ‘Trump’ comments is absolutely massive - around 11% of all comments were about Trump in January 2017, which is greater than comments about Russia during their invasion of Ukraine (10%) and comments about COVID during the first few months of the pandemic (7%). Even a topic like SJWs, which the Commentariat really liked talking about, could only manage a peak of around 1.2% (although eg ‘gender’ peaks at 5.5% and ‘feminis*’ peaks at 3.7%). Concepts like ‘Harambe’ and ‘Wikileaks’ barely register on this scale, at 0.3% and 0.5% peaks respectively. So even though the shape of the two curves looks similar when you normalise them, it is reasonable to believe Trump could have had a significant enough impact on the comments section to dislodge forum norms, in a way Harambe did not.
Misha Gurevich, Vivian Belenky, and Rachel A, $50K, to manufacture far-UVC lamps. Far-UVC is a type of ultraviolet light that kills germs rapidly; in a room with correctly-installed far-UVC lighting, viruses and bacteria die before they can reach another host, and the spread of contagious diseases plummets. In a world where this technology reached its full potential, respiratory pandemics like flu and coronavirus would cease to occur. Until now, these lamps have been limited to a few research prototypes. Last year, an ACXG-sponsored study worked to establish that they are safe for human use; results were reassuring. The next step is to produce them at scale as a consumer product for use in schools, daycares, and houses. Misha’s company Aerolamp has an early developer’s kit lamp on sale now, and is looking to hire an industrial designer experienced in safety and compliance who can help them transition to a mass-manufacturable version. If that’s you, get in touch with them here. Misha is a personal friend and a longtime ACXG evaluator; due to conflict of interest, this grant is being covered in conjunction with an outside funder.
Maximillian Seunik, $50K, for Screwworm Free Future. The screwworm is a nasty flesh-eating parasite that infests cattle and occasionally humans. It was laboriously eliminated from the US in the 1960s, from Mexico and Central America in the 90s, and finally fought to a standstill along the defensible chokepoint of the Panama isthmus in 2006. Since then, the US has regularly dropped sterile male screwworms over Panama; these distract the females and prevent them from advancing back north. During COVID, the parasite breached the barrier; it’s now back as far as Mexico, and likely to re-enter the US soon. SFF wants to encourage the development and testing of genetic biocontrol approaches, alongside other technology, to rapidly suppress screwworm populations. If these techniques work in screwworms, they could later be applied to mosquitoes, ticks, and other pests.
David Carel, $150K, to help put air purifiers in schools. Pure air is an easy sell, but an increasing body of research suggests it may have unexpected advantages, including raising test scores in classrooms. This might just be because students with fewer respiratory diseases take fewer absences, or there might be more interesting connections between air pollution, respiratory health, focus, and achievement. Many schools bought air purifiers during COVID but forgot about them afterwards, or turned them off because they were too noisy; now they languish in closets, fully functional but unused. David wants to lobby schools to use the devices they have, and to develop quieter devices that are better suited for classrooms. If you’re a school, potential funder, or other would-be collaborator, please contact him here.
45: Andrew Snyder-Beattie on the latest advances in biodefense. Without having fully resolved the debate over the real-world utility of COVID-era masks and N95s, the next generation of masks - elastomeric respirators - seem significantly more effective, including for people not specially trained in wearing them. Also, propylene glycol vapor - ie the fog in fog machines - kills all germs. Having indoor spaces constantly enveloped in fog is a weird ask, but we might find ways to make it work for crucial infrastructure during a pandemic, and “the US already produces enough to cover all industrial and much residential floorspace.” More things I didn’t know: “In a worst-case scenario where all crops die instantly, the US has enough stockpiled food (including animal feed) to last at least 18 months.”
This is a combination of an absolute question (“how are conditions?”) and a relative question (“are they getting worse or better”), but you can disambiguate them here and get similar results. I conclude the vibes are actually bad. There is one anomaly, which is that I remember people complaining about the bad economy and the Boomers and hellworld since well before 2020 (consider the Trump and Sanders campaigns), but the official vibes didn’t crash until COVID. Is my memory faulty? The Economists’ Seemingly Rosy Statistics Here’s real median household income in the US over time (source): People today earn 33% more than they did during the Boomers’ heyday. Might this just be a few billionaires bringing the average up, while the incomes of ordinary people stagnate? No: this is median income. You’re thinking of mean income. The mean can be brought up by a few outliers; the median represents the exact most ordinary member of society. If you insist, here are the same data presented as the share of society making more than a certain threshold in inflation-adjusted dollars (source): Might cost-of-living increases have eaten all of these gains and then some? No: this is real median income, ie adjusted for inflation. Cost-of-living increases are a type of inflation, so those should be priced in. Might this just represent old people doing better, while the young are left behind? No: here are the same data disaggregated by age group (source): Young people’s incomes have increased as fast as everyone else’s. And the youth-specific unemployment rate was near historic lows until last year (some people blame the current uptick on AI, but this is too recent to have caused the vibecession): Here’s an attempt to compare generations directly. 
We can’t do this as a point-in-time estimate, because late-career old people will always earn more than early-career young people, but we can compare how much people made in inflation-adjusted dollars at the same ages: Just as our previous graphs imply, Millennials and Zoomers earn significantly more than Boomers did at the same age, even in inflation-adjusted dollars. So, the economists conclude, maybe it really is just vibes. We know of other cases where the public believes things are worsening even as they get better: crime rates are the classic example. But most people judge crime rates by what they hear on TV. Vanishing economic opportunity is much more personal. Can people really be wrong about something so close to their own lives? Fine, You’ve Proven The Contradiction We Already Knew About, Get To The Point Where You Solve It. We start by looking at other people’s proposed solutions. (Briefly) Declining Real Wages The term “vibecession” most strictly refers to the period 2023 - 2024 when economic indicators were up, but consumer sentiment was down. During that period, Noah Smith popularized a paper by Darren Grant arguing that this corresponded to a brief decline in real wages, even though stocks and other indicators kept rising: During COVID, the government instituted various relief programs which temporarily gave people lots of money (the spike). This caused some inflation, which temporarily lowered real (ie inflation-adjusted) wages. Then inflation calmed down and real wages started rising again - thus Noah’s post title, “The End Of The Vibecession?” With the benefit of two more years of data, we see that Noah and Darren were right about the trend: Wages never jumped back to the point where they would be if the pandemic had never happened, but they’re back to growing as fast as ever. So this could explain the mini-vibecession of 2023-2024. Still, I claim there is a broader vibecession. 
Young people felt closed out from opportunities before 2023, and they still feel that way. Since only the 2023-2024 period saw falling real wages, this can’t be the full explanation. The Housing Theory Of Everything John Burn-Murdoch, after examining some of these same data, agrees that wages can’t be the full story. He writes: Are millennials wrong to complain? I fear not. The per capita measure is a beautifully simple rejoinder, but it misses one crucial detail. Wealth accumulation — just like income — matters primarily to millennials today as a means to home ownership, especially as we move into an era of high interest rates. If we deflate wealth by the index of house prices instead of the CPI, millennials’ assets only go about half as far as boomers’ once did. We’re left with a smaller millennial deficit than the original chart implied, but a deficit nonetheless. The YIMBYs at Works In Progress go further, and present The Housing Theory Of Everything (or at least of everything bad): Try listing every problem the Western world has at the moment. Along with Covid, you might include slow growth, climate change, poor health, financial instability, economic inequality, and falling fertility. These longer-term trends contribute to a sense of malaise that many of us feel about our societies. They may seem loosely related, but there is one big thing that makes them all worse. That thing is a shortage of housing: too few homes being built where people want to live. And if we fix those shortages, we will help to solve many of the other, seemingly unrelated problems that we face as well. Here is the Case-Shiller index, the standard measure of US home prices. I’ve started it in 1985 to match our other graphs: If I were designing an index to present the case that capitalism had not failed, I would have avoided naming it “Case Shiller”. During this time, average home price has approximately doubled. Might this only reflect falling interest rates? 
That is, suppose people can only afford a certain level of monthly mortgage payment. When interest rates are high, that mortgage payment would correspond to a cheap house; when they are low, that same person willing to spend that same amount could buy a more expensive house. To really work with this, we need average mortgage payment over time. Kevin Drum has this up to 2020: …but it matters a lot whether that spike at the end is a temporary pandemic effect or a permanent regime change. I’ve tried to calculate an updated version from FRED data: Average monthly payment in 1985 dollars. Going to tell my bank I’m paying my mortgage in 1985 dollars from now on. This matches Drum’s data enough to build confidence, and it shows that the post-pandemic spike has lasted. Mortgage payments are almost twice as high as in the 2010s. The COVID housing spike was partly a function of lockdown locking people in their houses (meaning that having a nice house was more important), and partly a function of the government cutting mortgage rates to alleviate lockdown-related economic distress. But why did it last even after COVID lockdowns ended? Partly because the homebuyers who bought houses during COVID will never move again, because that would mean giving up their great mortgages.
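To make the interest-rate argument concrete, here is a minimal sketch using the standard fixed-rate amortization formula. The 30-year term, the example rates, and the function names are assumptions for illustration; this is not the FRED-based calculation described above.

```python
def monthly_payment(principal, annual_rate, years=30):
    """Monthly payment on a fixed-rate, fully amortizing loan."""
    n = years * 12           # number of monthly payments
    r = annual_rate / 12     # monthly interest rate
    if r == 0:
        return principal / n
    factor = (1 + r) ** n
    return principal * r * factor / (factor - 1)


def affordable_principal(payment, annual_rate, years=30):
    """Largest loan that a fixed monthly payment can service (inverse of the above)."""
    n = years * 12
    r = annual_rate / 12
    if r == 0:
        return payment * n
    factor = (1 + r) ** n
    return payment * (factor - 1) / (r * factor)


if __name__ == "__main__":
    budget = 2000  # hypothetical monthly budget in dollars
    for rate in (0.03, 0.07):  # illustrative rates, not historical data
        loan = affordable_principal(budget, rate)
        print(f"At {rate:.0%}, ${budget}/month services a loan of about ${loan:,.0f}")
```

Holding the monthly budget fixed, the lower rate supports a much larger loan, which is why rising house prices alone cannot settle the question and the payment series is the more informative one.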
If this is to be taken seriously, AI is already a bigger political issue than abortion, climate change, or the environment. I fail my 2023 prediction that there was only a 20% chance this would happen by 2028. 25: Related: Bernie Sanders in The Guardian: “There is a very real fear that, in the not-so-distant future, a super-intelligent AI could replace humans in controlling the planet.” The Left has a complicated relationship with existential risk from AI: they really hate AI, which in theory should push them towards yet another reason to be against it. But they hate AI so much that they need to believe every negative thing about it at the same time, and one of those negative things is that it’s just a scam and will never work, and this naturally pushes against being concerned about x-risk. But as AI improves, will the “just a scam” position become less tenable, shunting the associated psychic energy into other reasons to hate AI (including x-risk concerns)? 26: Qualia Research Institute has released a video describing some of the work they’ve been doing the past year - The Oscilleditor: An Algorithmic Breakthrough for Psychedelic Visual Replication (1080p•⚠️SEIZURE): 27: Jesse Arm (X): “A majority of American rabbinical students are now women. Most are also LGBTQ. That includes Modern Orthodoxy. Remove Modern Orthodoxy and the numbers climb even higher.” Clergy have always served as spiritual counselors; as religions liberalize and other roles become less important, the therapist role starts to predominate. But 75% of therapists in the US are female; at the limit of liberalization where clergyman = therapist, we should expect the same gender ratio. 28: The latest news on the COVID origins debate: scientists find a naturally occurring bat coronavirus with a COVID-like furin cleavage site. This is a point in favor of the natural origins hypothesis, since the second-best argument for lab leak was that COVID’s furin cleavage site was too strange to evolve naturally. 
But I think arguments that lab leak has “fallen apart” are premature: the best argument (COVID emerged only a few miles from the biggest coronavirus gain-of-function lab in the Eastern Hemisphere) remains strong. I update from something like 95% chance it’s natural to something like 96%, but not 99.99% or anything. And here’s a lab leaker arguing that COVID’s furin cleavage site is out-of-frame and so still more unnatural-looking than the one on the recently-discovered bat virus. 29: Nicholas Decker (econ blogger, famous for his controversial autistic takes and Secret Service visit) has a dating doc. Most interesting section is the one about children: he wants to have them, but doesn’t think they should be genetically related to him. From here: If this appeals to you, you can find his contact info on the document. Related: Governor Jared Polis of Colorado is a fan of Nicholas Decker and Richard Hanania. 30: Matt Yglesias comes out as aphantasic (unable to see images in his “mind’s eye”). He says that contra the usual perspective that frames this as a deficit, he finds it helpful. For example, once he got assaulted, and he remembers on an intellectual level that it happened, but since “I wasn’t taking pictures of myself getting kicked in the head so, as far as I’m concerned, it’s like it happened to someone else” (Matt usually has good instincts, so I’m surprised he uses an example which will be such catnip to his conservative critics). He thinks it makes him a better reasoner / statistics blogger / effective altruist to be able to “get a statistically valid view of the situation, not overindex on the happenstance of your life.” For what it’s worth, I’ll give my contrary data point - I think of myself as a reasoner / statistics blogger / effective altruist in a pretty similar vein as Matt, but AFAICT my visual imagination is totally normal; if other people are having their emotions yanked around by vivid images, that’s a skill issue. 
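As a rough check on the size of the update in item 28 (from about 95% to about 96% for natural origin), the sketch below converts both probabilities to odds and reports the implied Bayes factor. Treating the update as a single clean likelihood ratio is an assumption made for illustration, and the function names are hypothetical.

```python
def odds(p):
    """Convert a probability into odds in favor."""
    return p / (1 - p)


def implied_bayes_factor(prior, posterior):
    """Likelihood ratio that would move `prior` to `posterior` under Bayes' rule."""
    return odds(posterior) / odds(prior)


if __name__ == "__main__":
    bf = implied_bayes_factor(0.95, 0.96)
    print(f"95% -> 96% implies a Bayes factor of about {bf:.2f}")
    big = implied_bayes_factor(0.95, 0.9999)
    print(f"95% -> 99.99% would need a Bayes factor of about {big:.0f}")
```

Going from 19:1 to 24:1 odds corresponds to a factor of roughly 1.26, which matches the framing of the new finding as modest evidence; reaching 99.99% from the same starting point would take a factor in the hundreds.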
31: Lakshya Jain in The Argument: The COVID political backlash [to the Democratic Party] has disappeared. Despite the narrative, polls show that voters don’t favor or disfavor either party over COVID, mostly still think school closures were necessary, and are about evenly split on vaccine mandates. I guess I can’t disagree with this poll - it seems well-done - but I still wonder whether something is being missed. Maybe it didn’t make the ~50% of voters who are naturally liberal desert the cause, but it energized conservatives in a way that might otherwise not have happened? Related, from Rob Wiblin on X, on balance Britons think the government response to COVID was not strict enough. 32: Related: Back when neoreaction was a big deal, I occasionally discussed posts by neoreactionary blogger Spandrell of Bloody Shovel. If you’re wondering what happened to him, you can read his 2024 Post-Mortem Of Neoreaction here, where he discusses how he fell out of love with the movement (warning: he has not fallen out of love with racial slurs). As a former fascist sympathizer, I can see why [fascism is on the downswing]. The allure of fascism in 2024 is much, much diminished. For a few reasons. A big one was COVID. See, the point of fascism is that Collective Action is necessary to have nice things. We need a strong government committed to the good of the people. Yarvin showed his preference early when he started his new Substack by quoting Cicero’s phrase “Salus populi suprema lex”. The health of the people is the most important law. Cicero wasn’t a fascist of course, nor is Yarvin really; a big point of fascism is to narrowly define the populus as an ethnic group with demonstrable ties to blood. That makes the government’s ties to the people stronger, increasing their commitment to do Good Collective Action. Which is important. Very important. A lot of good things can come of intelligently done Collective Action. Fascist Italy made the trains run on time. 
Nazi Germany fixed the terrible Weimar economy. East Asian countries are all effectively fascist states, if with less ideological baggage (yellows just aren’t like that), and they are all nice, clean, safe places with healthy economies. Fascism is not a panacea but it works, when you let it. Strong government can be pretty neat. So why is strong government less appealing these days? Well, COVID happened. And our governments were pretty damn strong in dealing with it. They made strong laws and enforced them. And what did they do with their power? Absolutely retarded shit. They destroyed the world economy and made 95% of people completely miserable for 18 months. Up to 3 long years in some places. Again, as an Orient enjoyer I was very sympathetic of strong effective government. My life has been pretty cozy thanks to it for the past decades. But after seeing boomers, hypochondriacs, and menopausal women take the reins and use it against healthy people, I’m fucking done with strong effective government. Fuck that shit, I’m out. I don’t want to see strong effective government ever again. I was very lucky that I was out of China in November 2019. It was a fluke really. I moved to the Golden Triangle after that and the law of the jungle was much, much nicer during the Doctors Plague of 2020-2022. But I spent a few months in Europe during the time and man, that was brutal. Not just seeing how retarded governments were; the level of compliance by the people was so disheartening. Imagine being a sincere fascist and seeing your people behave like that. These are my people? My Volk? Am I supposed to sacrifice life and limb for the salus of this populus? Fuck that. Let them cook, they deserve everything that’s coming to them [...] Is there a way to make the body healthy again? I do think so. I think there’s still place for a successor right wing ideology which is neither Christian fundamentalism or robot worship. And it will happen; but it won’t happen on Twitter. 
Maybe it can happen on Urbit, or right here in this site. I have some ideas myself, and I invite you to join me and build this together. It would be funny if the solution to the paradox Jain highlights was that for every time a COVID lockdown turned a liberal into a conservative, it turned one fascist into a moderate, for a net rightward shift of zero. 33: Also from an Argument poll: In a hypothetical Presidential matchup, Gavin Newsom beats JD Vance 54-46. I’m split between the usual heuristic of ignoring any polling more than a year before an election, and the fact that this is a remarkably big lead for polarized 21st century America. 34: Jerl wades into the David Hume on miracles debate. 35: AI Teddy Bears: A Brief Investigation. The good news is that your child’s AI teddy bear is hard to jailbreak and probably will not tell them where to find guns: The other good news is that somehow they don’t charge a subscription, which makes them a way to get usually-subscription-only AI models for free. How is this possible? “[The most likely hypothesis is that] Witpaw is an adorable piece of spyware and he’s selling my data to the CCP”. 36: This month’s anti-people-named-Sacks content: NYT on Trump AI czar David Sacks’ conflicts of interest; New Yorker on whether neurologist Oliver Sacks used his case studies to work through his own issues rather than presenting them accurately. [EDITED TO ADD: I originally framed it this way as a joke, but on further research I think David and Oliver are related. Wikipedia says that Oliver was first cousins with Israeli statesman Abba Eban, and that Abba Eban was born to Lithuanian Jewish parents in Cape Town. David Sacks’ bio says he was born to Jewish parents in Cape Town, and this article specifies that they were Lithuanian. I doubt there were too many Lithuanian Jewish families named Sacks in mid-1900s Cape Town, so sure, related!] 37: Orca Sciences: There Has To Be A Better Way To Make Titanium. 
Titanium is a great metal - strong, light, and tough. If we had cheap titanium, it could revolutionize manufacturing the way cheap steel and aluminum did in previous eras. So why don’t we? Not because titanium is rare: it’s “the 9th most common element in the earth’s crust”. Rather, it’s very complicated and expensive to extract from its ore. Some kind of breakthrough in titanium extraction processes always seems tantalizingly close, but has never quite materialized. Is there any hope? 38: If Asians Are Lactose Intolerant, Why All The Milk Tea? Lactose intolerance has confused me for a long time - 23andMe tells me that I’m lactose intolerant, but I drink milk regularly without problems, so what’s up? This post’s answer: lactose-intolerant people who don’t usually drink milk will get sick if they start suddenly. Lactose-intolerant people who drink milk regularly since childhood develop gut microbiota that can digest milk, but which demand an expensive “tax” in calories. Lactose-tolerant people will always be able to digest milk and absorb all the calories themselves. 39: How do different majors change college students’ political beliefs? No surprise that the humanities and social sciences shift people left; no surprise that business and economics shift them right. I was a little surprised that engineering shifts people right a little, and that Education of all things shifts people right (albeit only slightly). How is that even possible? Are these people coming in as Mao Zedong and leaving as “only” Leon Trotsky? Also, Political Science is exactly neutral, lol. [EDIT: I misunderstood, they’re using natural sciences as a zero point, this is a reasonable choice but slightly changes the interpretation] 40: Kindkristin: Language models improved my mental health. 
41: More floor employment, from the WSJ (h/t @LaocoonofTroy): Big Paychecks Can’t Woo Enough Sailors For America’s Commercial Fleet: “Straight out of college, graduates from the country’s maritime academies can earn more than $200,000 as a commercial sailor, with free food and private accommodations... Despite the pay and perks, maritime jobs go begging, and it is raising national-security concerns.” Other selling points include “six months vacation, live wherever you want, and you’re serving the nation” and onboard “gyms, connectivity, and cuisine”. The catch is that you have to be at sea for months at a time. 42: Study (h/t @KierkegaardEmil): there was minimal “learning loss” from COVID school closures, best estimate is “0.02 standard deviations per 100 days of school closure”. I correctly predicted this back in 2021, but I also wrote in March of this year about how there’s been a general decline in NAEP scores since then. It seems like maybe a student having their specific school closed for longer than other schools didn’t hurt them, but some sort of general cultural change, maybe related to COVID, did hurt. 43: Sam Bankman-Fried’s mother on why she thinks his trial was unfair. SBF is appealing his conviction and will probably be making some of these same points in court. Can’t find a prediction market directly on the appeal, but this one says only 15% chance he serves under 10 years, this one says 15% chance of a Trump pardon, so it doesn’t seem like there’s much room for him to be freed (or get a significantly shorter sentence) on appeal. And Wired says that only 5-10% of appeals like these succeed. 44: Related: Trump pardons Juan Orlando Hernandez, former Honduran president extradited to the US for narco-corruption. 
Some sources are trying to find a Prospera angle - Prospera and other ZEDEs were approved under JOH’s administration, and the Prosperans seem to have good MAGAworld connections - but I don’t think this is their top priority, and I don’t know if it requires much explanation for Trump to be pro-right-wing Latin American politicians convicted by the Biden administration. More interesting is that apparently JOH and SBF were cellmates (X), “SBF spent extensive time helping JOH with trial prep” and SBF told an interviewer that “Juan Orlando is the most innocent prisoner I’ve met, myself included.” ChatGPT is not impressed with the Trump/SBF case for JOH’s innocence. Related: JOH’s conservative party on track to win this month’s extremely-close Honduran elections, great news for Prospera if it happens. 45: The “100 Above The Park” building in St Louis (h/t Bobby Fijan on X): 46: The death toll of the ongoing Sudan genocide has risen to about 150,000. Nicholas Kristof writes that the world has once again failed to prevent atrocities, and argues that the most important point of leverage is pressure on the United Arab Emirates, which is arming the genociders. Sam Kriss also writes about the situation in The World’s First Matcha Labubu Genocide, but is unimpressed with Kristof’s take: Sudan is passed over in a deeply uncomfortable silence. The absolute most you can do is blame the Emiratis. From what I’ve seen, more people seem to be appalled at the UAE for its frankly marginal role in arming the RSF than at the RSF itself. This is the approved way of understanding any inscrutably indigenous foreign conflict: you just worm out any third-party involvement and then act like you’ve solved the whole thing. I side with Kristof here, for reasons that Sam himself touches on later in his piece, in a section comparing Darfur with Gaza. It would be very easy to make people care about Darfur again. All it would take is a loud, vocal contingent of RSF apologists in the Western media. 
I agree, but would frame it less cynically: the reason Westerners pay attention to Gaza is that there’s a lever to push: not only does America support Israel, but many of their friends support Israel, so they can imagine convincing America or at least their friends to stop, and at least feel like there is some remote chance of making a small difference (and in fact, Trump getting mad at Israel and deciding to pressure them was decisive in effecting the cease-fire). On the other hand, we don’t have many levers to affect ethnic Baggara in the Rapid Support Forces of Sudan, so it doesn’t really feel useful to write blog posts arguing that they should stop; obviously they should stop, nobody disagrees with this, and it goes without saying - so nobody says it. But the US does support the UAE, and many of our friends like the UAE or at least go there on vacation, so maybe it’s possible to make some small difference by embarrassing them. 4D chess take is that Sam Kriss agrees with all of this, but “loudly” and “vocally” argued against it to give people like me a hook to write about this genocide with, in which case I thank him for his sacrifice. It would also be nice to be able to donate, but I don’t know who to trust in the region - other than Doctors Without Borders, who are usually pretty good. 47: The AI Futures Project (group of AI-will-be-fast intellectuals) and the AI As A Normal Technology team (group of AI-will-be-slow intellectuals) wrote an adversarial collaboration in Asterisk explaining what they agree on, for example: That there’s an important distinction between existing AI and “strong AGI”
> “There is one anomaly, which is that I remember people complaining about the bad economy and the Boomers and hellworld since well before 2020 (consider the Trump and Sanders campaigns), but the official vibes didn’t crash until COVID. Is my memory faulty?”
This is what they took from you. They never should have passed the ‘Make It Illegal To Wear Hair Gel And Marry A White Woman Act’ back in 1959! He argues that the reason most wives work these days isn’t because we’re poorer (and they have to work to survive), but because we’re richer (and so wives can make so much money working outside the home that the opportunity cost is too high to pass up). A single earner could still support a family on a 1950s lifestyle. It would just feel like a failure, because we don’t realize how much worse the 1950s lifestyle was compared to our current conditions. The article’s paywalled, but you can get a pretty good sense of the argument from these paragraphs. After determining that the median man makes about $80,000/year, he writes: Let’s say our $80,000-a-year man is living in the Jacksonville area. The Department of Housing and Urban Development calculates what are called Fair Market Rents for each American metro - this means the 40th percentile rent for a home with any given set of characteristics. They say F.M.R. for a three-bedroom home in the Jacksonville area is $2,163. That comes out to about 30 percent of Mr. Median’s annual income. Can you really get a place to live for that little? Here’s a lovely three-bedroom home in the East Arlington neighborhood for $2,020 a month, and it’s zoned for an elementary school with a 10-out-of-10 ranking from GreatSchools. It’s true that 1,617 square feet is on the small side for, say, a family of five in the contemporary United States. But the average size of a new single family home was 1,289 square feet in 1960 and 1,500 square feet in 1970. Two of your kids are going to need to share a bedroom, but that’s how people lived back in the day. There’s more to life than housing, of course, but I started there because that’s the largest item in a household budget. Durable goods like furniture, cars, and appliances have all become better and more affordable since the mid-1960s. 
That’s partially offset by rising prices for things like college tuition, child care, and health care. But in the 1960s, most young people didn’t go to college. The way health insurance works, you only need one worker in your family to get a job-based health plan. And of course, with your wife serving as a full-time homemaker, you don’t need to worry about child care expenses. The big thing is that, with a larger family, you literally have a bunch of mouths to feed. But the model here is to replicate how people actually lived in the mid-1960s, which is that they dined out much less frequently and also spent a much larger share of their total income on food. When I try to retrace this, it seems possible, but barely. I imagined doing this in Sacramento, to be near family. Suppose I make $80K pretax = about $6.7K/month pretax = $5K per month posttax. A cheap 3-bedroom house on a nice-enough block is $2200 mortgage, assume $3K after property taxes etc. A cheap new car is $350/month. Food can be arbitrarily low if you’re willing to eat rice all the time, but let’s say $250/month. CoveredCalifornia offered my family of four healthcare for $600/month. So top four expenses take $4200/month of the $5000/month posttax income. I don’t know; seems tough. I would like to see a more thorough breakdown of an average 2026 vs. 1956 man’s likely budget. There are also some areas where it’s harder to separate genuine declines from rising expectations. Most people in the 1950s didn’t have health insurance. Was that because they accepted lower levels of health, or because medical care was cheaper, and easy enough to afford out-of-pocket? Probably some very complicated combination of both. And it might be impossible to get certain kinds of 1950s medical care today, i.e. a bed in a cheap low-quality shared hospital room. 
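For anyone who wants to retrace the budget arithmetic above, here is a minimal sketch in Python. It uses only the figures quoted in the text; the variable names are mine, and the $5,000 take-home number is the text's own rough estimate, not a computed tax result.

```python
# Rough single-earner budget check using only figures quoted in the text.
# All names are illustrative; the take-home figure is the text's estimate,
# not the output of any tax calculation.

ANNUAL_PRETAX = 80_000       # median man's income, per the quoted article
MONTHLY_POSTTAX = 5_000      # the text's rounded take-home estimate

# Top four monthly expenses from the Sacramento thought experiment.
expenses = {
    "housing (mortgage plus property taxes etc.)": 3_000,
    "car": 350,
    "food": 250,
    "health insurance": 600,
}

monthly_pretax = ANNUAL_PRETAX / 12
total_expenses = sum(expenses.values())
remaining = MONTHLY_POSTTAX - total_expenses
share_of_takehome = total_expenses / MONTHLY_POSTTAX

# Jacksonville Fair Market Rent for a three-bedroom, from the quoted passage,
# as a share of annual pretax income.
FMR_RENT = 2_163
rent_share_of_pretax = FMR_RENT * 12 / ANNUAL_PRETAX

print(f"Monthly pretax income: ${monthly_pretax:,.0f}")
print(f"Top four expenses:     ${total_expenses:,}")
print(f"Left over per month:   ${remaining:,}")
print(f"Share of take-home:    {share_of_takehome:.0%}")
print(f"FMR rent / pretax:     {rent_share_of_pretax:.1%}")
```

On these numbers the top four expenses eat 84% of take-home pay, leaving $800/month for everything else, and the Jacksonville rent comes to a bit over 32% of pretax income, which is roughly the "about 30 percent" in the quoted passage.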
(some of the best discussion around this came from the response to Elizabeth Warren’s The Two-Income Trap, see eg Matt Bruenig here) Still, I find this tangential to the main point. Yes, a few conservatives complain that it’s hard to have a single-income family. But most vibecession complaints come from singles or dual-earner households! 4: What About Other Countries? … Dionysus writes: Did you know that China also has a vibecession? If even China can’t regulate social media heavily enough to prevent this phenomenon, how can any liberal society possibly hope to? The link goes to an NYT article, which includes quotes like: Using apps like RedNote and Douyin, people are reviving memories of the 2000s and the early 2010s with photos of daring outfits, upbeat songs and vintage TV commercials, all of which, in different ways, evoke a time in China that pulsed with optimism. “The music back then throbbed with exuberance, brimming with the sense that the future could only get brighter,” a middle-aged man said in a RedNote video. “Today’s lyrics begin with lines like, ‘We’re trying our best to survive.’” And The boom-time beauty meme is the latest expression of a Gen Z counterculture born of disillusionment, the recognition that they may be the first generation in half a century unlikely to surpass their parents’ standard of living, no matter how hard they try. Over the past five years, this quiet resistance has taken many forms. It began with “lying flat,” a refusal to join the rat race. Some chose to pursue the “run philosophy,” or emigrating in search of freedom and brighter prospects. Others declared themselves the “last generation,” vowing not to have children. Still others embraced “let it rot,” giving up on difficult goals rather than battling for uncertain rewards. To show they could care less about career prospects, many took to wearing “gross outfits” at work. 
This is especially crazy in China, where GDP per capita is now ten times what it was back during the “Boom Years” that everyone reminisces about. This might be the smoking gun that people’s economic beliefs are totally unmoored from how rich they are. The Chinese story has an obvious moral: people care about growth rate more than level. But even this doesn’t work for America - our Vibecession doesn’t correspond to a period of unusually low growth. machine_spirit writes: It’s interesting to compare it to Europe as the control group. Unlike the US, whose economy muddled through just fine during the last decade, we are currently experiencing a massive economic decline that could soon turn into a full-blown collapse. And yet, outside of debates about immigration or foreign policy especially regarding Ukraine you don’t really hear the same level of rancour about ‘things being bad’ in the local media. I’m surprised to hear this. I hear many economic complaints from Europeans, but I suppose this passes through my own American filter bubble which is incentivized to talk about economic hardship for its own American reasons. Golden Feather writes: I am an Italian currently living in the US. My main guesses would be: Right-wing parties control a supermajority of TV and print media. They have also been in the govt most of the time, which means they control the state TV and have an interest in presenting things as rosey. The much older population makes the internet less relevant for public sentiment. Even in the few years where they were at the opposition, they mostly focused on immigration and crime to rile up popular sentiment, I guess because the population is older, their voters even moreso, so they care more about that than about the economy
Graffiti: There are no good data for graffiti. Most of the discussion focuses on New York, where everyone agrees the long-term trend is down since 1970. The Graffiti In New York City Wikipedia page has a “decline of New York graffiti subculture” section, which explains that in the 1980s, when “broken window” policing became popular, the police cracked down on graffiti and this worked somewhat. The only numbers are here, and they describe a decrease of 13% in calls to the graffiti hotline between 2011 and 2016. But the more recent picture, and the story in other cities, is less sanguine; in the past few years, graffiti is “a bigger problem than ever” in Los Angeles and has “gotten worse” in San Francisco. Plausibly this is the same pattern as crime, which was declining for decades until COVID and the Black Lives Matter protests caused it to rebound in 2020. A contrary data point is Britain, where graffiti reports almost doubled between 2013 and 2017; I don’t know enough about the British context to have an opinion.
I’ve confirmed the post-2009 trend; I haven’t fully double-checked the others but they match my impressions. This looks like a similar pattern to crime, although here the likely explanation for the COVID bump is the pandemic-associated rise in house prices. Good measures of tent encampments over long periods are hard to find. San Francisco has this one: …but it starts in 2019, peaks during the pandemic, and then declines. This can’t really show whether 2019 was already higher than some previous year. Here is an interesting graph of Seattle homeless sweeps, i.e. number of times the police acted against encampments: …but it doesn’t tell us whether encampments are increasing, or the police are taking them more seriously. It does rule out a story where encampments are increasing because the police are no longer taking action - aside from the pandemic, police are taking more action than ever, at least as measured here. People With Loud Boom Boxes In Public Places: All I have to say about this one is that it’s terrible and I hate it. Overall, it’s surprisingly hard to find data confirming that disorder has increased: Littering seems to be down