DerSimonian-Laird test
Article
DerSimonian-Laird test is a recurring concept in the Astral Codex Ten archive, appearing 2 times across 2 issues between June 05, 2022 and February 01, 2023. The archive places it in contexts such as “I should have used a DerSimonian-Laird test because it’s a meta-analysis”; “a DerSimonian-Laird test, and applies it to the same data”. It most often appears alongside ivermectin, 2006 Ioannidis paper, ACTIV-6.
Metadata
- Category: Concepts
- Mention count: 2
- Issue count: 2
- First seen: June 05, 2022
- Last seen: February 01, 2023
Appears In
Related Pages
- ivermectin (2 shared issues)
- 2006 Ioannidis paper (1 shared issue)
- ACTIV-6 (1 shared issue)
- Alexandros (1 shared issue)
- Alexandros Marinos (1 shared issue)
- American (1 shared issue)
- Aref (1 shared issue)
- Argentina (1 shared issue)
- Argentine (1 shared issue)
- Astralcodexten Com (1 shared issue)
- Australia (1 shared issue)
- azithromycin (1 shared issue)
External Links
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
2: In my ivermectin post, about a third of the way down, are two analyses of whether a raw meta-analysis makes it look like ivermectin works. I concluded that they showed marginal effect, but that this was probably due to other factors (eg antiparasitic properties). A reader points out that it was wrong to do this by t-test, and I should have used a DerSimonian-Laird test because it’s a meta-analysis, which would have shown a clear (not marginal) effect, so I updated the post and my Mistakes page. More recently, another reader has commented that a DerSimonian-Laird test is also inappropriate because the studies aren’t homogeneous, and now I’m not sure which test is appropriate or what result it would give - but it definitely wasn’t the one I originally tried. I don’t think this significantly alters the overall conclusion of the post, which was that the apparent effect (whether marginal or clear) was better explained by other things.
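For readers unfamiliar with the method named in the passage above, the following is a minimal sketch of a DerSimonian-Laird random-effects pooling step, written with the Python standard library only. It is not the analysis either reader ran: the function name `dersimonian_laird`, the choice of log risk ratios as the effect scale, and every number in the demo are assumptions made for illustration. The sketch computes a fixed-effect estimate, Cochran's Q, the method-of-moments between-study variance (tau²), and then re-weights the studies to get a pooled estimate with a normal-approximation p-value.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effect estimates with the DerSimonian-Laird estimator.

    effects   -- per-study effect sizes on an additive scale (e.g. log risk ratios)
    variances -- the matching within-study sampling variances
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with one variance each")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to every within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # Two-sided p-value from a normal approximation to pooled / se.
    z = pooled / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))

    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    return {"pooled": pooled, "se": se, "z": z, "p": p_value,
            "tau2": tau2, "q": q, "i2": i2}


if __name__ == "__main__":
    # Invented log risk ratios and variances, for illustration only;
    # these are NOT taken from any ivermectin trial.
    log_rr = [-0.60, -0.10, -0.45, 0.05]
    var = [0.09, 0.04, 0.16, 0.02]
    result = dersimonian_laird(log_rr, var)
    print({name: round(value, 4) for name, value in result.items()})
```

When tau² comes out at zero the random-effects weights equal the fixed-effect weights, so the two pooled estimates coincide. A larger tau² makes the weights more equal across studies and widens the standard error, which is one reason the choice of test can move a result between "marginal" and "clear".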
“Synthetic control groups” - ie comparing people in a trial to some previously-known understanding of how a disease progresses - are a standard practice, and basically fine. Borody et al indeed have had amazing careers with many things they can be proud of. But I continue to believe that this paper is not among them.
Synthetic control groups are more common in social sciences, but have occasionally been used in pharmacology when it would be unethical or extremely difficult to use a real control group. The most common use case is rare cancers, where it takes years to get enough patients to test a drug and it also seems kind of unethical to delay. Another good thing about rare cancers is that they're pretty discrete; you don't have to worry about things like "well, 90% of leukemias never make it to a doctor anyway, so maybe we're only seeing the serious leukemias" or "these guys counted the leukemias that get dealt with by the local doctors' office, but those other guys counted the leukemias that have to go to the hospital".
More important, studies with synthetic control groups usually go above and beyond to justify why their synthetic control group should be a fair comparison to the treatment group. Here's an example, from a paper about a rare leukemia.
They start by getting a synthetic control group from a previous randomized controlled trial of leukemia drugs (not the general population!) Then they throw out more than half their patients for not being a good match for the selection criteria of the current study. Then they investigate whether there are significant differences on five important demographic factors, and find a few. Then they re-weight the patients in the historical comparator study to adjust out the differences between the previous population and the current population. Then they do some analyses to check if they re-weighted everything correctly.
Then they apologize profusely for having to use this vastly inferior methodology at all:
In special cases when a disease is rare, prognosis is very poor, and there are limited therapeutic options available, single-arm clinical trials may be used as evidence for accelerated drug approvals. Comprehensive evaluation of historical comparator or reference data can provide an additional approach for putting the efficacy of a new therapy into perspective [11, 12]. In this study, we applied different statistical methods and sensitivity analyses to evaluate the clinical efficacy of blinatumomab against historical data. Concerns often raised regarding the use of historical comparator data are the influence of potential biases related to selection, misclassification and confounding [12]. The requirement of rigorous eligibility criteria in the blinatumomab clinical study—such as Eastern Cooperative Oncology Group status of two or lower and absence of abnormal lab values during screening—may increase the chance of better outcomes in the clinical study than the historical data. While it may be possible to use unadjusted historical data when patient populations are sufficiently similar [27], the disproportionate number of advanced-stage patients in the blinatumomab trial required methods applied to individual-level data to minimize bias. Selection bias was minimized by use of stringent inclusion criteria into the historical data set and by weighting or adjusting for known prognostic factors. In addition, the historical data set represented adult R/R patients who received standard of care (excluding palliative care patients where possible), without any restrictions to any patient subgroups. Residual confounding may still remain and be difficult to control for, particularly in data sets where differences in important prognostic factors are unknown or not measured in one data set.
In this study, nearly all known important prognostic factors were adjusted for in the weighted or propensity score analyses. Missing data on key covariates led to exclusion of some records from the analyses (Figure 1), which may theoretically bias the overall results. However, our examination of records with missing covariates did not identify significant differences by patient demographic characteristics compared with patients who had complete data (data not shown). Misclassification bias was limited by harmonization of patient-level data in the pooled analysis, which employed common data definitions for disease classification and outcomes characterization.
Compare this to how the Borody study discusses its synthetic control group:
The control data was from contemporary infected subjects in Australia obtained from published Covid Tracking Data.
I hesitate to say “they didn’t even say which tracking data”, because in the past I’ve said things like that and just missed it. But I can’t find them saying which tracking data.
In Borody et al’s synthetic control group, 70/600 (11.5%) patients required hospitalization. But the US hospitalization rate appears to be about 1% for unvaccinated individuals. So Borody’s synthetic control group got 10x the expected hospitalization rate. This seems very relevant to this study finding that ivermectin decreases hospitalization by 90%!
I’m not claiming this is fraudulent, or impossible, or means the study couldn’t have been good. And Borody claim to have used an “equivalent” control group, so maybe there was some adjustment done for this. But this is why we usually use more than one word to describe our control groups! Or use real control groups that don’t ruin your study if you do a finicky adjustment slightly wrong! I feel like these are the kinds of questions Alexandros needs to be asking, instead of just giving a link to a Stat News article about how sometimes synthetic control groups are okay.
Also other questions, like “how come this found a 90% decrease in hospitalization and mortality, but lots of other studies found smaller decreases, and the biggest and best studies found none at all?” I know Alexandros’ answers are to find lots of flaws with the biggest and best studies, but these flaws wouldn’t be enough to cover up a 90% cure rate. And if you’re in the business of calling out flaws in studies I genuinely think having your control group be “we used some group of people somewhere in Australia, they had 10x the normal hospitalization rate, we won’t tell you anything else” would be the sort of flaw you would call out!
Thomas Borody is a genuinely brilliant gastroenterologist and I am very grateful for his life-saving discoveries. But Elon Musk is a genuinely brilliant engineer and I am very grateful for his low-cost reusable rockets - and this doesn’t mean he never does crazy inexplicable things. Maybe Borody and his collaborators have a point from this study, but I don’t feel like it makes sense as written. If they ever explain what they were doing in more detail and it’s some sort of amazing 4D-chess move that makes total sense, I will apologize to them. Otherwise, stick to inventing amazing life-saving digestive therapies.
In response to this section, Alexandros stresses that he is not necessarily saying Borody et al is incorrect or challenging my decision to leave it out. He writes:
I will repeat that my strong objection, is that you wrote "this is not how you control group, @#!% you". I therefore pointed to stat news to support my case that, yes, this can indeed be how you control group. That's all. In the article I even noted that this aversion towards disrespect to elders may even be a cultural difference between us. To be clear, if I were making a case for ivermectin, I would not be relying on this study as my starting point.
III. Hokey Meta-Analysis
Alexandros points out that I used the wrong statistical test when analyzing the overall picture gleaned from these studies. He’s right. The right statistical test would make ivermectin look stronger, without changing the sign of the conclusion.
After getting a core group of potentially trustworthy studies, I tried to see whether ivermectin still had a statistically significant positive effect in them. I tried to be honest that I didn’t really know how to do formal meta-analyses:
Probably I’m forgetting some reason I can’t just do simple summary statistics to this, but whatever. It is p = 0.15, not significant . . . What happens if I unprincipledly pick whatever I think the most reasonable outcome to use from each study is? . . . Now it’s p = 0.04, seemingly significant
I in fact could not do simple summary statistics to this. Alexandros describes the test I should have used, a DerSimonian-Laird test, and applies it to the same data. Now the numbers are p = 0.03 and p < 0.0001. I accept that I was wrong, he is right, and this is more accurate.
My original conclusion to this section was that although you couldn’t be absolutely sure from the numbers, eyeballing things it definitely looked like ivermectin had an effect. I then went on to try to explain that effect. With Marinos’ corrections, you can be sure from the numbers, but the rest of the post - an attempt to explain the effect - still stands.
IV. Worms
Alexandros brings up issues with the Strongyloides hypothesis; Dr. Bitterman graciously responds. I find the issues real enough to lower my credence in the idea, but not to completely rule it out. Even if it is true, I probably overestimated how important it was.
My original explanation for the effect was Dr. Avi Bitterman’s theory of Strongyloides hyperinfection. Many people in certain tropical regions are infected with the parasitic worm Strongyloides. Usually a person’s immune system keeps this worm under control, and the parasites cause only limited problems. But under certain situations - especially when people take immune-suppressing corticosteroids - the immune system fails, the worms multiply, and the patient can potentially die of sudden worm overgrowth (“hyperinfection”). Corticosteroids are a common COVID treatment. So plausibly some people in tropical areas fighting COVID are at risk of dying from worm hyperinfection.
Ivermectin was originally an anti-parasitic-worm medication before being repurposed to fight COVID, and everyone agrees it is very good at this. So if many people in COVID trials are dying of worm infections, then ivermectin could help them. This would look like ivermectin reducing mortality in COVID trials, and make people wrongly conclude that ivermectin treats COVID.
Alexandros responds to this theory here, again I’ll try to summarize:
The original Bitterman paper concludes that ivermectin trials show stronger results in high-Strongyloides-prevalence regions. But it mixes prevalence data from two different papers with different methodologies. Correcting for this, the findings no longer clear a formal bar for statistical significance, and don’t really look significant either.
Inline links: are a standard practice, Here's an example, 11, 12, 27, Figure 1, about 1% for unvaccinated individuals, here