Green
Article
Green is a recurring person in the Astral Codex Ten archive, appearing 3 times across 3 issues between August 30, 2021 and December 31, 2025. The archive places it in contexts such as “Green (1963), the results were argued to sho”; “in Green (1963)”; “Green is one of these five studies, and it does superficially find loss aversion”. It most often appears alongside COVID, Cremieux, and Italy.
Metadata
- Category: People
- Mention count: 3
- Issue count: 3
- First seen: August 30, 2021
- Last seen: December 31, 2025
Appears In
- On Hreha On Behavioral Economics
- Prison And Crime: Much More Than You Wanted To Know
- Highlights From The Comments On Vibecession
Related Pages
- COVID (2 shared issues)
- Cremieux (2 shared issues)
- Italy (2 shared issues)
- Philadelphia (2 shared issues)
- US (2 shared issues)
- 1955 (1 shared issue)
- 4chan (1 shared issue)
- AARP (1 shared issue)
- Abrams 2012 (1 shared issue)
- Acceptable Losses (1 shared issue)
- Acceptable Losses: The Debatable Origins of Loss Aversion (1 shared issue)
- ACLU (1 shared issue)
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
I find I usually click the third box on both. I want to tip generously, but giving the maximum possible tip seems profligate. Surely the third box is the right compromise. I recently noticed that this is insane. For a $35 meal, I’m giving GrubHub drivers $3 and UberEats drivers $7 for the same service (or maybe there’s some difference between their services which makes UberEats suggest the higher tip - but if there is, I don’t know about it and it doesn’t affect my decision). Again, this is Behavioral Economics 101 - in particular, one of the many biases lumped together under menu effects. Instead of being a rational economic actor who values food delivery at a certain price, I’m trying to be a third-box-of-four kind of guy. That means that whoever is in charge of this menu has lots of power over the specific dollar amount I give. Not infinite power - if the third box said $1000 I would notice and refuse. But enough power that “nudging” seems like a fair description. Nobody believes studies anymore, which is fair. I trust in a salvageable core of behavioral economics and “nudgenomics” because I can feel in my bones that they’re true for me and the people around me. Let’s move on to Hreha’s article and see if we can square it with my belief in a “salvageable core”. II. Yechaim’s Historical Detective Story Hreha writes: The biggest replication failures relate to the field's most important idea: loss aversion. To be honest, this was a finding that I lost faith in well before the most recent revelations (from 2018-2020). Why? Because I've run studies looking at its impact in the real world—especially in marketing campaigns. If you read anything about this body of research, you'll get the idea that losses are such powerful motivators that they'll turn otherwise uninterested customers into enthusiastic purchasers. The truth of the matter is that losses and benefits are equally effective in driving conversion. 
In fact, in many circumstances, losses are actually *worse* at driving results. Why? Because loss-focused messaging often comes across as gimmicky and spammy. It makes you, the advertiser, look desperate. It makes you seem untrustworthy, and trust is the foundation of sales, conversion, and retention. "So is loss aversion completely bogus?" Not quite. It turns out that loss aversion does exist, but only for large losses. This makes sense. We *should* be particularly wary of decisions that can wipe us out. That's not a so-called "cognitive bias". It's not irrational. In fact, it's completely sensical. If a decision can destroy you and/or your family, it's sane to be cautious. "So when did we discover that loss aversion exists only for large losses?" Well, actually, it looks like Kahneman and Tversky, winners of the Nobel Prize in Economics, knew about this unfortunate fact when they were developing Prospect Theory—their grand theory with loss aversion at its center. Unfortunately, the findings rebutting their view of loss aversion were carefully omitted from their papers, and other findings that went against their model were misrepresented so that they would instead support their pet theory. In short: any data that didn't fit Prospect Theory was dismissed or distorted. I don't know what you'd call this behavior... but it's not science. This shady behavior by the two titans of the field was brought to light in a paper published in 2018: "Acceptable Losses: The Debatable Origins of Loss Aversion". I encourage you to read the paper. It's shocking. This line from the abstract sums things up pretty well: "...the early studies of utility functions have shown that while very large losses are overweighted, smaller losses are often not. In addition, the findings of some of these studies have been systematically misrepresented to reflect loss aversion, though they did not find it." 
When the two biggest scientists in your field are accused of "systemic misrepresentation", you know you've got a serious problem. Which leads us to another paper, published in 2018, entitled "The Loss of Loss Aversion: Will It Loom Larger Than Its Gain?". The paper's authors did a comprehensive review of the loss aversion literature and came to the following conclusion: "current evidence does not support that losses, on balance, tend to be any more impactful than gains." Yikes. But given the questionable origins of the field, it's not surprising that its foundational finding is *also* dubious. If loss aversion can't be trusted, then no other idea in the field can be trusted. This argument relies on two papers - Yechaim’s Acceptable Losses and Gal & Rucker’s Loss Of Loss Aversion. Yechaim’s paper is a historical detective story. It looks at how Kahneman and Tversky first “discovered” and popularized the idea of loss aversion from earlier 1950s and 1960s research. It concludes they did a bad job summarizing this earlier research; looked at carefully, it doesn’t support the strong conclusions they drew. From one perspective, nobody should care about this. All the 1950s and 1960s research was terrible - one of the most important studies it discusses had n = 7. Since then, we’ve had much more rigorous studies of tens of thousands of people. All that hinges on Yechaim’s paper is whether Kahneman and Tversky were personally bad people. Hreha thinks they were. He calls their behavior “shady”, “shocking”, and says they “systematically misrepresented findings to support their pet theory…I don't know what you'd call this behavior... but it's not science.” Again, nothing important really hinges on this, but I feel like fighting about it, so let’s look deeper anyway. Here’s how Yechaim summarizes his accusation against K&T: In addition, the results of several studies seem to have been misrepresented by Fishburn and Kochenberger (1979) and Kahneman and Tversky (1979). 
Galenter and Pliner (1974) were wrongly cited as showing loss aversion, whereas, in fact, they did not observe an asymmetry in the pleasantness ratings of gains and losses. Likewise, in Green (1963), the results were argued to show loss aversion, even though this study did not involve any losses. In addition, the objective outcomes for some of the participants in Grayson (1960) were transformed by Fishburn and Kochenberger (1979) so as to better support a model assuming different curvatures for gains and losses (see Table 1). Finally, studies showing no loss aversion or suggesting aversion to large losses were not cited in Fishburn and Kochenberger (1979) or in Kahneman and Tversky (1979). Yechaim bases his argument on three sets of early studies of loss aversion: Galenter and Pliner (1974), Fishburn and Kochenberger’s review (1979) and miscellaneous others. —Galenter and Pliner— is actually really neat! It explores “cross-modal” perceptions of gains versus losses. That is, if you ask how much a certain loss hurt, people will probably just say something like “I dunno, a little?” and then it will be hard to turn that into a p-value. G&P solve this by making people listen to loud noises, and asking questions like “is the difference between how much loss A and loss B hurt greater or lesser than the difference between the volume of noise 1 and noise 2?” The idea is that the brain uses a bunch of weird non-numerical scales for everything, and we understand its weird non-numerical scale for noise volume pretty well, and so maybe we can compare it to how people think about gains or losses. I don’t know why people in 1974 were doing anything this complicated instead of inventing the basic theory of loss aversion the way Kahneman and Tversky would five years later, but here we are. 
Anyway, Yechaim concludes that this study failed to find loss aversion: Summing up their findings, Galenter and Pliner (1974) reported as follows: “We now turn to the question of the possible asymmetry of the positive and negative limbs of the utility function. On the basis of intuition and anecdote, one would expect the negative limb of the utility function to decrease more sharply than the positive limb increases... what we have observed if anything is an asymmetry of much less magnitude than would have been expected ... the curvature of the function does not change in going from positive to negative” (p. 75). Thus, our search for the historical foundations of loss aversion turns into a dead end on this particular branch: Galenter and Pliner (1974) did not observe such an asymmetry; and their study was quoted erroneously [by Kahneman and Tversky]. I looked for the full text of Galenter and Pliner, but could not find it. I was however able to find the first two pages, including the abstract. The way Galenter and Pliner summarize their own research is: Cross-modality matching of hypothetical increments of money against loudness recover the previously proposed exponent of the utility function for money within a few percent. Similar cross-modality matching experiments for decrements give a disutility exponent of 0.59, larger than the utility exponent for increments. This disutility exponent was checked by an additional cross-modality matching experiment against the disutility of drinking various concentrations of a bitter solution. The parameter estimated in this fashion was 0.63. If I understand the bolded part right, the abstract seems to be saying that they did find loss aversion! I was also able to find the Google Books listing for the book that the study was published in. 
Its summary is: Three experiments were conducted in which monetary increments and decrements were matched to either the loudness of a tone or the bitterness of various concentrations of sucrose octa-acetate. An additional experiment involving ratio estimates of monetary loss is also reported. Results confirm that the utility function for both monetary increments and decrements is a power function with exponents less than one. The data further suggest that the exponent of the disutility function is larger than that of the utility function, i.e., the rate of change of 'unhappiness' caused by monetary losses is greater than the comparable rate of 'happiness' produced by monetary gains. (Author). Again, the way the book is summarized (apparently by the author) says this study does prove loss aversion. Without being able to access the full study, I’m not sure what’s going on. Possibly the study found loss aversion, but it was less than expected? Still, I feel like Yechaim should have mentioned this. At the very least, it decreases Kahneman and Tversky’s crime from “lied about a study to support their pet theory” to “credulously believed the authors’ own summary of their results and didn’t dig deeper”. But also, why did the authors believe their study showed loss aversion? Why does Yechaim disagree? Without being able to access the full paper, I’m not sure. —Green 1963— is the second study that Yechaim accuses K&T of misrepresenting. Here’s how K&T cite this study in their paper: It is of interest that the main properties ascribed to the value function have been observed in a detailed analysis of von Neumann-Morgenstern utility functions for changes of wealth (Fishburn and Kochenberger [14]). The functions had been obtained from thirty decision makers in various fields of business, in five independent studies [5, 18, 19, 21, 40]. 
Most utility functions for gains were concave, most functions for losses were convex, and only three individuals exhibited risk aversion for both gains and losses. With a single exception, utility functions were considerably steeper for losses than for gains. Green 1963 is footnote 19. So K&T don’t even mention it by name. They mention it as one of several studies that a review article called Fishburn and Kochenberger analyzes. F&K are reviewing a bunch of studies of executives. In each study, a very small number of executives (usually about 5-10 per study) make a hypothetical business decision comparing gains and losses, for example: Suppose your company is being sued for patent infringement. Your lawyer’s best judgement is that your chances of winning the suit are 50–50; if you win, you will lose nothing, but if you lose, it will cost the company $1,000,000. Your opponent has offered to settle out of court for $200,000. Would you fight or settle? Then they ask the same question with a bunch of other numbers, and plot implied utility functions for each executive based on the answer. Green is one of these five studies, and it does superficially find loss aversion. But Fishburn and Kochenberger have done something weird. They argue that “loss” and “gain” aren’t necessarily objective, and usually correspond to “loss relative to some reference frame” (so far, so good). In order to figure out where the reference frame is, they assume that the neutral point is wherever “something unusual happens to the individual’s utility function” (F&K’s words). So they shift the zero point separating losses and gains to wherever the utility function looks most interesting! After doing this, they find “loss aversion”, ie the utility curve changes its slope at the transition between the loss side and the gain side. But since the transition was deliberately shifted to wherever the utility curve changed slope, this is almost tautological. 
It isn’t quite tautological: it’s interesting that most of the utility curves had a sharp transition zone, and it’s interesting that the transition was in the direction of loss-aversion rather than gain-seeking. But it’s tautological enough to be embarrassing. Still, this is Fishburn and Kochenberger’s embarrassment, not Kahneman and Tversky’s. And Fishburn and Kochenberger included this study in their review alongside several other studies that didn’t do this to the same degree. Kahneman and Tversky just cited the review article. I don’t think citing a review article that does weird things to a study really qualifies as “systematic misrepresentation.” I guess I’m having a hard time figuring out how angry to be, because everything about Fishburn and Kochenberger is terrible. The average study in F&K includes results from 5-10 executives. But the studies are pretty open about the fact that they interviewed more executives than this, threw away the ones who gave boring answers, and just published results from the interesting ones. Then they moved the axes to wherever looked most interesting. Then they used all this to draw sweeping generalizations about human behavior. Then F&K combined five studies that did this into a review article, without protesting any of it. And then K&T cited the review article, again without protesting. I have to imagine that all of this was normal by the standards of the time. I have looked up all these people and they were all esteemed scientists in their own day. And I believe the evidence shows K&T summarized F&K faithfully. Shouldn’t they have avoided citing F&K at all? Seems like the same kind of question as “Shouldn’t Pythagoras have published his theorem in a peer-reviewed journal, instead of moving to Italy, starting a cult, and exposing his thigh at the Olympic Games as part of a scheme to convince people he was the god Apollo?” Yes, but the past was a weird place. 
As best I can tell, K&T’s citation of G&P agrees with the authors’ own assessment of their results. Their citation of F&K agrees with the reviewers’ assessment and with a charitable reading of most of the studies involved, although those studies are terrible in many ways which are obvious to modern readers. I would urge people interested in the whodunit question to read Kahneman and Tversky’s original paper. I think it paints the picture of a team very interested in their own results and in theory, and citing other people only incidentally, and in accordance with the scientific standards of their time. I don’t feel a need to tar them as “misrepresenters”. III. Okay, But Is Loss Aversion Real? Remember, all that is about the personal deficiencies of Kahneman and Tversky. Realistically there have been hundreds of much better studies on loss aversion in the forty years since they wrote their article, so we should be looking at those. Here Hreha cites Gal & Rucker: The Loss Of Loss Aversion: Will It Loom Larger Than Its Gain? It’s a great 2018 paper that looks at recent evidence and concludes that loss aversion doesn’t exist. But it’s a very specific, interesting type of nonexistence, which I think the Hreha article fails to capture. G&R are happy to admit that in many, many cases, people behave in loss-averse ways, including most of the classic examples given by Kahneman and Tversky. They just think that this is because of other cognitive biases, not a specific cognitive bias called “loss aversion”. They especially emphasize Status Quo Bias and the Endowment Effect. Status Quo Bias is where you prefer inaction to action. Suppose you ask someone “Would you bet on a coin flip, where you get $60 if heads and lose $40 if tails?”. They say no. This deviates from rational expectations, and one way to think of this is loss aversion; the prospect of losing $40 feels “bigger” than the prospect of gaining $60. 
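The coin-flip bet above can be worked through numerically. The following is only a minimal sketch of the loss-aversion reading, under assumptions that are not in the original text: a linear value function, and a hypothetical weight `lam` that multiplies losses (`lam = 1` means losses and gains count equally). The function names are illustrative.

```python
def expected_value(p_win, gain, loss):
    """Plain expected dollar value of a bet that pays `gain` or costs `loss`."""
    return p_win * gain - (1 - p_win) * loss


def loss_averse_value(p_win, gain, loss, lam):
    """Subjective value when losses are weighted by a hypothetical factor `lam`."""
    return p_win * gain - (1 - p_win) * lam * loss


def break_even_lambda(p_win, gain, loss):
    """Loss weight at which the bet's subjective value is exactly zero."""
    return (p_win * gain) / ((1 - p_win) * loss)


if __name__ == "__main__":
    # The bet from the text: win $60 on heads, lose $40 on tails.
    print(expected_value(0.5, 60, 40))              # 10.0
    print(break_even_lambda(0.5, 60, 40))           # 1.5
    print(loss_averse_value(0.5, 60, 40, lam=2.0))  # -10.0
```

Under these assumptions the bet is worth +$10 in expectation, so refusing it would require weighting losses more than 1.5 times as heavily as gains. The sketch says nothing about which psychological mechanism produces the refusal.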
But another way to think of it is as a bias towards inaction - all else being equal, people prefer not to make bets, and you’d need a higher payoff to overcome their inertia. Endowment Effect is where you value something you already have more than something you don’t. Suppose someone would pay $5 to prevent their coffee mug from being taken away from them, but (in an alternative universe where they lack a coffee mug) would only pay $3 to buy one. You can think of this as loss aversion (the grief of losing a coffee mug feels “bigger” than the joy of gaining one). Or you can think of it as endowment (once you have the coffee mug, it’s yours and you feel like defending it). These are really fine distinctions; I had to read the section a few times before the difference between loss aversion and endowment effect really made sense to me. Kahneman and Tversky just sort of threw all this stuff out and saw what stuck and didn’t necessarily try super hard to make sure none of the biases they discovered were entirely explainable as combinations of some of the others. G&R think maybe loss aversion is. They do some clever work setting up situations that test loss aversion but not status quo or endowment - for example, offering a risky bet vs. a safer bet. Here they find no evidence for loss aversion as a separate force from the other two biases. Somewhere in this process, they did an experiment where they gave participants a quarter minted in Denver and asked them if they wanted to exchange it for a quarter minted in Philadelphia. 60% of people very reasonably didn’t care, but another 35% had grown attached to their Denver quarter, with only 5% actively seeking the novelty of Philadelphia. Psychology is weird. I understand why some people would summarize this paper as “loss aversion doesn’t exist”. 
But it’s very different from “power posing doesn’t exist” or “stereotype threat doesn’t exist”, where it was found that the effect people were trying to study just didn’t happen, and all the studies saying it did were because of p-hacking or publication bias or something. People are very often averse to losses. This paper just argues that this isn’t caused by a specific “loss aversion” force. It’s caused by other forces which are not exactly loss aversion. We could compare it to centrifugal force in physics: real, but not fundamental. Also, you can’t use this paper to argue that “behavioral economics is dead”. At best, the paper proves that loss aversion is better explained by other behavioral economic concepts. But you can’t get rid of behavioral econ entirely! The stuff you have to explain is still there! It’s just a question of which parts of behavioral econ you use to explain it. Complicating this even further is Mrkva et al, Loss Aversion Has Moderators, But Reports Of Its Death Are Greatly Exaggerated (h/t Alex Imas, who has a great Twitter thread about this). This is an even newer paper, 2019, which argues that Gal and Rucker are wrong, and loss aversion does have an independent existence as a real force. There are many things to like about this paper. Previous criticisms of loss aversion argue that most experiments are performed on undergrads, who are so poor that even small amounts of money might have unusual emotional meaning. Mrkva collects a sample of thousands of millionaires (!) and demonstrates that they show loss aversion for sums of money as small as $20. On the other hand, I’m not sure they’re quite as careful as G&R at ruling out every other possible bias (although I don’t have a great understanding of where the borders between biases are and I can’t say this for sure). The main point I want to make is that all the scientists in this debate seem smart, thoughtful, and impressive. 
This isn’t like social priming experiments where one person says a crazy thing, nobody ever replicates it at scale, and as soon as someone tries the whole thing collapses. These have been replicated hundreds of times, with the remaining arguments being complicated semantic and philosophical ones about how to distinguish one theory from a very slightly different theory. If that takes replicating your result on a sample of thousands of millionaires, people will gather a sample of thousands of millionaires and get busy on the replication. Just overall really impressive work. I don’t feel qualified to take a side in the G&R vs. Mrkva debate, but both teams make me really happy that there are smart and careful people considering these questions. And this is just a drop in the bucket. Alex Imas also links Replicating patterns of prospect theory for decision under risk, which says: Though substantial evidence supports prospect theory, many presumed canonical theories have drawn scrutiny for recent replication failures. In response, we directly test the original methods in a multinational study (n = 4,098 participants, 19 countries, 13 languages), adjusting only for current and local currencies while requiring all participants to respond to all items. The results replicated for 94% of items, with some attenuation. Twelve of 13 theoretical contrasts replicated, with 100% replication in some countries. Heterogeneity between countries and intra-individual variation highlight meaningful avenues for future theorizing and applications. We conclude that the empirical foundations for prospect theory replicate beyond any reasonable thresholds. Beyond any reasonable thresholds! IV. Do Nudges Work? or, How Small Is Small? Continuing through the Hreha article: For a number of years, I've been beating the anti-nudge drum. Since 2011, I've been running behavioral experiments in the wild, and have always been struck by how weak nudges tend to be. 
In my experience, nudges usually fail to have *any* recognizable impact at all. This is supported by a paper that was recently published by a couple of researchers from UC Berkeley. They looked at the results of 126 randomized controlled trials run by two "nudge units" here in the United States. I want you to guess how large of an impact these nudges had on average... 30%? 20%? 10%? 5%? 3%? 1.5%? 1%? 0%? If you said 1.5%, you'd be right (the actual number is 1.4%, but if I had written that out you would have chosen it because of its specificity). According to the academic papers these nudges were based upon, these nudges should have had an average impact of 8.7%. But, as you probably understand by now, behavioral economics is not a particularly trustworthy field. I actually emailed the authors of this paper, and they thought the ~1% effect size of these interventions was something to be applauded—especially if the intervention was cheap & easy. Unfortunately, no intervention is truly cheap or easy. Every single intervention requires, at the very minimum, administrative overhead. If you're going to do something, you need someone (or some system) to implement and keep track of it. If an intervention is only going to get you a 1% improvement, it's probably not even worth it. Uber infamously had a team of behavioral economists working on its product, trying to “nudge” people in the right direction. Relatedly, Uber makes $10 billion in yearly revenue. If they can “nudge” people to spend 1% more, that’s $100 million. That’s not much relative to revenue, but it’s a lot in absolute terms. In particular, it pays the salary of a lot of behavioral economists. If you can hire 10 behavioral economists for $100,000 a year and make $100 million, that’s $99 million in profit. Or what if you’re a government agency, trying to nudge people to do prosocial things? 
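The Uber arithmetic above can be checked with a few lines of Python. This is a rough sketch, not a model of any real company's finances: the revenue, lift, and salary figures come from the passage, while the function name `nudge_net_gain` and the small-firm comparison (a hypothetical business with $1 million in revenue paying for the same team) are assumptions added for illustration.

```python
def nudge_net_gain(base, lift_percent, cost):
    """Extra value from a nudge that lifts `base` by `lift_percent`, minus its cost."""
    return base * lift_percent / 100 - cost


# Figures from the passage: 10 behavioral economists at $100,000 a year each.
TEAM_COST = 10 * 100_000

if __name__ == "__main__":
    # $10 billion in yearly revenue, nudged 1% higher.
    print(nudge_net_gain(10_000_000_000, 1, TEAM_COST))  # 99000000.0

    # Hypothetical small firm (assumption): $1 million revenue, same team cost.
    print(nudge_net_gain(1_000_000, 1, TEAM_COST))       # -990000.0
```

With these inputs the same 1% effect nets $99 million at the larger scale and loses $990,000 at the smaller one, so whether the overhead is worth paying depends on the size of the base the nudge applies to.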
There are about 90 million eligible Americans who haven’t gotten their COVID vaccine, and although some of them are hard-core conspiracy theorists, others are just lazy or nervous or feel safe already. (source) Whoever decided on that grocery gift card scheme was nudging, whether or not they have an economics degree - and apparently they were pretty good at it. If some sort of behavioral econ campaign can convince 1.5% of those 90 million Americans to get their vaccines, that’s 1.4 million more vaccinations and, under reasonable assumptions, maybe a few thousand lives saved. Hreha says that: Every single intervention requires, at the very minimum, administrative overhead. If you're going to do something, you need someone (or some system) to implement and keep track of it. If an intervention is only going to get you a 1% improvement, it's probably not even worth it. This depends on scale! 1% of a small number isn’t worth it! 1% of a big number is very worth it, especially if that big number is a number of lives! A few caveats. First, a small number only matters if it’s real. It’s very easy to get spurious small effects, so much so that any time you see a small effect you should wonder if it’s real. I’m ready to be forgiving here because behavioral economics is so well-replicated and common-sensically true, but I wouldn’t blame anyone who steers clear. Second, Hreha says: To be honest, you can probably use your creativity to brainstorm an idea that will get you a 3-4% minimum gain, no behavioral economics "science" required. Which leads me to the final point I'd like to make: rules and generalizations are overrated. The reason that fields like behavioral economics are so seductive is because they promise people easy, cookie-cutter solutions to complicated problems. Figuring out how to increase sales of your product is hard. You need to figure out which variables are responsible for the lackluster interest. Is the price the issue? Is the product too hard to use? 
Is the design tacky? Is the sales organization incompetent? Is the refund/return policy lacking? etc. Exploring these questions can take months (or years) of hard work, and there's no guarantee that you'll succeed. If, however, a behavioral economist tells you that there are nudges that will increase your sales by 10%, 20%, or 30% without much effort on your part... Whoa. That's pretty cool. It's salvation. Thus, it's no surprise that governments and companies have spent hundreds of millions of dollars on behavioral "nudge" units. Unfortunately, as we've seen, these nudges are woefully ineffective. Specific problems require specific solutions. They don't require boilerplate solutions based on general principles that someone discovered by studying a bunch of 19 year old college students. However, the social sciences have done a good job of convincing people that general principles are better solutions for problems than creative, situation-specific solutions. In my experience, creative solutions that are tailor-made for the situation at hand *always* perform better than generic solutions based on one study or another. Hreha is a professional in this field, so presumably he’s right. Still, compare to medicine. A thoughtful doctor who tailors treatment to a particular patient sounds better (and is better) than one who says “Depression? Take this one all-purpose depression treatment which is the first thing I saw when I typed ‘depression’ into UpToDate”. But you still need medical journals. Having some idea of general-purpose laws is what gives the people making creative solutions something to build upon. (also, at some point your customers might want to check your creative solution to see whether it actually gives a “3-4% minimum gain, no behavioral economics required”, and that would be at least vaguely study-shaped.) Third, everyone who said nudging had vast effects is still bad and wrong. 
Many of them were bad and wrong and making fortunes consulting for companies about how to implement the policies they were claiming were super-powerful. This is suspicious and we should lower our opinion of them accordingly. In a previous discussion of growth mindset, I wrote: Imagine I claimed our next-door neighbor was a billionaire oil sheik who kept thousands of boxes of gold and diamonds hidden in his basement. Later we meet the neighbor, and he is the manager of a small bookstore and has a salary 10% above the US average... Should we describe this as “we have confirmed the Wealthy Neighbor Hypothesis, though the effect size was smaller than expected”? Or as “I made up a completely crazy story, and in unrelated news there was an irrelevant deviation from literally-zero in the same space”? All the people talking about oil sheiks deserve to get asked some really uncomfortable questions. And a lot of these will be the most famous researchers - the Dan Arielys of the world - because of course the people who successfully hyped their results a lot are the ones the public knows about. Still, the neighbor seems like a neat guy, and maybe he’ll give you a job at his bookstore. V. Conclusion: Musings On The Identifiable Victim Effect I actually skipped the very beginning of Hreha’s article. I want to come back to it now. It begins: The last few years have been particularly bad for behavioral economics. A number of frequently cited findings have failed to replicate. Here are a couple of high profile examples: The Identifiable Victim Effect (featured in the workbooks I wrote with Dan Ariely and Kristen Berman in 2014)
Inline links: menu effects, Acceptable Losses: The Debatable Origins of Loss Aversion, The Loss of Loss Aversion: Will It Loom Larger Than Its Gain?, Acceptable Losses, Loss Of Loss Aversion, the first two pages, https://substackcdn.com/image/fetch/$s_!W80n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7f832933-e9a1-4cf2-ba32-e4bc11ce681c_830x624.png, Kahneman and Tversky’s original paper, https://substackcdn.com/image/fetch/$s_!2VLm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8c49b44a-8850-4ac8-9a88-3dd07718a648_397x305.png, Loss Aversion Has Moderators, But Reports Of Its Death Are Greatly Exaggerated, Alex Imas, Replicating patterns of prospect theory for decision under risk, They looked at the results of 126 randomized controlled trials, others, https://substackcdn.com/image/fetch/$s_!1gY_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5b127d7c-045a-4266-b515-a0e9201ce651_571x382.png, a previous discussion of growth mindset, the workbooks I wrote with Dan Ariely and Kristen Berman in 2014
Green is one of these five studies, and it does superficially find loss aversion. But Fishburn and Kochenberger have done something weird. They argue that “loss” and “gain” aren’t necessarily objective, and usually correspond to “loss relative to some reference frame” (so far, so good). In order to figure out where the reference frame is, they assume that the neutral point is wherever “something unusual happens to the individual’s utility function” (F&K’s words). So they shift the zero point separating losses and gains to wherever the utility function looks most interesting! After doing this, they find “loss aversion”, ie the utility curve changes its slope at the transition between the loss side and the gain side. But since the transition was deliberately shifted to wherever the utility curve changed slope, this is almost tautological. It isn’t quite tautological: it’s interesting that most of the utility curves had a sharp transition zone, and it’s interesting that the transition was in the direction of loss-aversion rather than gain-seeking. But it’s tautological enough to be embarrassing. Still, this is Fishburn and Kochenberger’s embarrassment, not Kahneman and Tversky’s. And Fishburn and Kochenberger included this study in their review alongside several other studies that didn’t do this to the same degree. Kahneman and Tversky just cited the review article. I don’t think citing a review article that does weird things to a study really qualifies as “systematic misrepresentation.” I guess I’m having a hard time figuring out how angry to be, because everything about Fishburn and Kochenberger is terrible. The average study in F&K includes results from 5-10 executives. But the studies are pretty open about the fact that they interviewed more executives than this, threw away the ones who gave boring answers, and just published results from the interesting ones. Then they moved the axes to wherever looked most interesting. 
Then they used all this to draw sweeping generalizations about human behavior. Then F&K combined five studies that did this into a review article, without protesting any of it. And then K&T cited the review article, again without protesting. I have to imagine that all of this was normal by the standards of the time. I have looked up all these people and they were all esteemed scientists in their own day. And I believe the evidence shows K&T summarized F&K faithfully. Shouldn’t they have avoided citing F&K at all? Seems like the same kind of question as “Shouldn’t Pythagoras have published his theorem in a peer-reviewed journal, instead of moving to Italy, starting a cult, and exposing his thigh at the Olympic Games as part of a scheme to convince people he was the god Apollo?” Yes, but the past was a weird place. As best I can tell, K&T’s citation of G&P agrees with the authors’ own assessment of their results. Their citation of F&K agrees with the reviewers’ assessment and with a charitable reading of most of the studies involved, although those studies are terrible in many ways which are obvious to modern readers. I would urge people interested in the whodunit question to read Kahneman and Tversky’s original paper. I think it paints the picture of a team very interested in their own results and in theory, and citing other people only incidentally, and in accordance with the scientific standards of their time. I don’t feel a need to tar them as “misrepresenters”. III. Okay, But Is Loss Aversion Real? Remember, all that is about the personal deficiencies of Kahneman and Tversky. Realistically there have been hundreds of much better studies on loss aversion in the forty years since they wrote their article, so we should be looking at those. Here Hreha cites Gal & Rucker: The Loss Of Loss Aversion: Will It Loom Larger Than Its Gain? It’s a great 2018 paper that looks at recent evidence and concludes that loss aversion doesn’t exist. 
But it’s a very specific, interesting type of nonexistence, which I think the Hreha article fails to capture. G&R are happy to admit that in many, many cases, people behave in loss-averse ways, including most of the classic examples given by Kahneman and Tversky. They just think that this is because of other cognitive biases, not a specific cognitive bias called “loss aversion”. They especially emphasize Status Quo Bias and the Endowment Effect. Status Quo Bias is where you prefer inaction to action. Suppose you ask someone “Would you bet on a coin flip, where you get $60 if heads and lose $40 if tails?”. They say no. This deviates from rational expectations, and one way to think of this is loss aversion; the prospect of losing $40 feels “bigger” than the prospect of gaining $60. But another way to think of it is as a bias towards inaction - all else being equal, people prefer not to make bets, and you’d need a higher payoff to overcome their inertia. Endowment Effect is where you value something you already have more than something you don’t. Suppose someone would pay $5 to prevent their coffee mug from being taken away from them, but (in an alternative universe where they lack a coffee mug) would only pay $3 to buy one. You can think of this as loss aversion (the grief of losing a coffee mug feels “bigger” than the joy of gaining one). Or you can think of it as endowment (once you have the coffee mug, it’s yours and you feel like defending it). These are really fine distinctions; I had to read the section a few times before the difference between loss aversion and endowment effect really made sense to me. Kahneman and Tversky just sort of threw all this stuff out and saw what stuck and didn’t necessarily try super hard to make sure none of the biases they discovered were entirely explainable as combinations of some of the others. G&R think maybe loss aversion is.
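To make the coin-flip example concrete, here is a minimal sketch of the loss-aversion reading of that bet. It assumes a deliberately simple model in which losses are multiplied by a single weight before being compared with gains; this is an illustration, not Kahneman and Tversky's full value function, and the names `bet_value` and `break_even_loss_weight` are hypothetical.

```python
def bet_value(gain, loss, loss_weight=1.0, p_win=0.5):
    """Subjective value of a bet that pays `gain` with probability `p_win`
    and costs `loss` otherwise. `loss_weight` is an assumed multiplier on
    how much a lost dollar hurts relative to a gained dollar (1.0 = neutral)."""
    return p_win * gain - (1 - p_win) * loss_weight * loss


def break_even_loss_weight(gain, loss, p_win=0.5):
    """Smallest loss weight at which the bet stops looking attractive."""
    return (p_win * gain) / ((1 - p_win) * loss)


if __name__ == "__main__":
    # The $60 / $40 coin flip from the passage above.
    print(bet_value(60, 40))                   # neutral weighting
    print(break_even_loss_weight(60, 40))      # weight that makes it a wash
    print(bet_value(60, 40, loss_weight=2.0))  # a strongly loss-averse bettor
```

Under this toy model, someone who refuses the bet must weigh losses at least 1.5 times as heavily as gains. It only captures the loss-aversion story; a refusal driven by status quo bias would look identical from the outside, which is the distinction G&R care about.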
They do some clever work setting up situations that test loss aversion but not status quo or endowment - for example, offering a risky bet vs. a safer bet. Here they find no evidence for loss aversion as a separate force from the other two biases. Somewhere in this process, they did an experiment where they gave participants a quarter minted in Denver and asked them if they wanted to exchange it for a quarter minted in Philadelphia. 60% of people very reasonably didn’t care, but another 35% had grown attached to their Denver quarter, with only 5% actively seeking the novelty of Philadelphia. Psychology is weird. I understand why some people would summarize this paper as “loss aversion doesn’t exist”. But it’s very different from “power posing doesn’t exist” or “stereotype threat doesn’t exist”, where it was found that the effect people were trying to study just didn’t happen, and all the studies saying it did were because of p-hacking or publication bias or something. People are very often averse to losses. This paper just argues that this isn’t caused by a specific “loss aversion” force. It’s caused by other forces which are not exactly loss aversion. We could compare it to centrifugal force in physics: real, but not fundamental. Also, you can’t use this paper to argue that “behavioral economics is dead”. At best, the paper proves that loss aversion is better explained by other behavioral economic concepts. But you can’t get rid of behavioral econ entirely! The stuff you have to explain is still there! It’s just a question of which parts of behavioral econ you use to explain it. Complicating this even further is Mrkva et al, Loss Aversion Has Moderators, But Reports Of Its Death Are Greatly Exaggerated (h/t Alex Imas, who has a great Twitter thread about this). This is an even newer paper, 2019, which argues that Gal and Rucker are wrong, and loss aversion does have an independent existence as a real force. There are many things to like about this paper. 
Previous criticisms of loss aversion argue that most experiments are performed on undergrads, who are so poor that even small amounts of money might have unusual emotional meaning. Mrkva collects a sample of thousands of millionaires (!) and demonstrates that they show loss aversion for sums of money as small as $20. On the other hand, I’m not sure they’re quite as careful as G&R at ruling out every other possible bias (although I don’t have a great understanding of where the borders between biases are and I can’t say this for sure). The main point I want to make is that all the scientists in this debate seem smart, thoughtful, and impressive. This isn’t like social priming experiments where one person says a crazy thing, nobody ever replicates it at scale, and as soon as someone tries the whole thing collapses. These have been replicated hundreds of times, with the remaining arguments being complicated semantic and philosophical ones about how to distinguish one theory from a very slightly different theory. If that takes replicating your result on a sample of thousands of millionaires, people will gather a sample of thousands of millionaires and get busy on the replication. Just overall really impressive work. I don’t feel qualified to take a side in the G&R vs. Mrkva debate, but both teams make me really happy that there are smart and careful people considering these questions. And this is just a drop in the bucket. Alex Imas also links Replicating patterns of prospect theory for decision under risk, which says: Though substantial evidence supports prospect theory, many presumed canonical theories have drawn scrutiny for recent replication failures. In response, we directly test the original methods in a multinational study (n = 4,098 participants, 19 countries, 13 languages), adjusting only for current and local currencies while requiring all participants to respond to all items. The results replicated for 94% of items, with some attenuation.
Twelve of 13 theoretical contrasts replicated, with 100% replication in some countries. Heterogeneity between countries and intra-individual variation highlight meaningful avenues for future theorizing and applications. We conclude that the empirical foundations for prospect theory replicate beyond any reasonable thresholds. Beyond any reasonable thresholds! IV. Do Nudges Work? or, How Small Is Small? Continuing through the Hreha article: For a number of years, I've been beating the anti-nudge drum. Since 2011, I've been running behavioral experiments in the wild, and have always been struck by how weak nudges tend to be. In my experience, nudges usually fail to have *any* recognizable impact at all. This is supported by a paper that was recently published by a couple of researchers from UC Berkeley. They looked at the results of 126 randomized controlled trials run by two "nudge units" here in the United States. I want you to guess how large of an impact these nudges had on average... 30%? 20%? 10%? 5%? 3%? 1.5%? 1%? 0%? If you said 1.5%, you'd be right (the actual number is 1.4%, but if I had written that out you would have chosen it because of its specificity). According to the academic papers these nudges were based upon, these nudges should have had an average impact of 8.7%. But, as you probably understand by now, behavioral economics is not a particularly trustworthy field. I actually emailed the authors of this paper, and they thought the ~1% effect size of these interventions was something to be applauded—especially if the intervention was cheap & easy. Unfortunately, no intervention is truly cheap or easy. Every single intervention requires, at the very minimum, administrative overhead. If you're going to do something, you need someone (or some system) to implement and keep track of it. If an intervention is only going to get you a 1% improvement, it's probably not even worth it. 
Uber infamously had a team of behavioral economists working on its product, trying to “nudge” people in the right direction. Relatedly, Uber makes $10 billion in yearly revenue. If they can “nudge” people to spend 1% more, that’s $100 million. That’s not much relative to revenue, but it’s a lot in absolute terms. In particular, it pays the salary of a lot of behavioral economists. If you can hire 10 behavioral economists for $100,000 a year and make $100 million, that’s $99 million in profit. Or what if you’re a government agency, trying to nudge people to do prosocial things? There are about 90 million eligible Americans who haven’t gotten their COVID vaccine, and although some of them are hard-core conspiracy theorists, others are just lazy or nervous or feel safe already. (source) Whoever decided on that grocery gift card scheme was nudging, whether or not they have an economics degree - and apparently they were pretty good at it. If some sort of behavioral econ campaign can convince 1.5% of those 90 million Americans to get their vaccines, that’s 1.4 million more vaccinations and, under reasonable assumptions, maybe a few thousand lives saved. Hreha says that: Every single intervention requires, at the very minimum, administrative overhead. If you're going to do something, you need someone (or some system) to implement and keep track of it. If an intervention is only going to get you a 1% improvement, it's probably not even worth it. This depends on scale! 1% of a small number isn’t worth it! 1% of a big number is very worth it, especially if that big number is a number of lives! A few caveats. First, a small number only matters if it’s real. It’s very easy to get spurious small effects, so much so that any time you see a small effect you should wonder if it’s real. I’m ready to be forgiving here because behavioral economics is so well-replicated and common-sensically true, but I wouldn’t blame anyone who steers clear. 
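The "this depends on scale" point is just arithmetic, and a short sketch makes it easy to check. The figures below are the ones quoted in the passage (Uber's $10 billion revenue, ten economists at $100,000 each, 90 million unvaccinated Americans); the function name `nudge_gain` and the choice to express effects in basis points (to keep the arithmetic in exact integers) are assumptions made for illustration.

```python
def nudge_gain(base, effect_bp):
    """Size of a nudge's effect on `base`, with the effect given in
    basis points (100 bp = 1%). Integer arithmetic avoids float rounding."""
    return base * effect_bp // 10_000


if __name__ == "__main__":
    # Uber: a 1% nudge on $10 billion revenue, minus ten $100,000 salaries.
    uber_gain = nudge_gain(10_000_000_000, 100)
    uber_cost = 10 * 100_000
    print(uber_gain, uber_gain - uber_cost)

    # Vaccines: a 1.5% nudge on 90 million eligible people.
    print(nudge_gain(90_000_000, 150))

    # The same 1% on a small base: a shop with $50,000 in sales (hypothetical).
    print(nudge_gain(50_000, 100))
```

The Uber case comes out to $100 million gained and $99 million net; the vaccine case is 1,350,000 people, which the text rounds to 1.4 million. The hypothetical small shop gains $500, which would not cover much administrative overhead, so the same percentage supports both Hreha's conclusion and the opposite one depending on the base.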
Second, Hreha says: To be honest, you can probably use your creativity to brainstorm an idea that will get you a 3-4% minimum gain, no behavioral economics "science" required. Which leads me to the final point I'd like to make: rules and generalizations are overrated. The reason that fields like behavioral economics are so seductive is because they promise people easy, cookie-cutter solutions to complicated problems. Figuring out how to increase sales of your product is hard. You need to figure out which variables are responsible for the lackluster interest. Is the price the issue? Is the product too hard to use? Is the design tacky? Is the sales organization incompetent? Is the refund/return policy lacking? etc. Exploring these questions can take months (or years) of hard work, and there's no guarantee that you'll succeed. If, however, a behavioral economist tells you that there are nudges that will increase your sales by 10%, 20%, or 30% without much effort on your part... Whoa. That's pretty cool. It's salvation. Thus, it's no surprise that governments and companies have spent hundreds of millions of dollars on behavioral "nudge" units. Unfortunately, as we've seen, these nudges are woefully ineffective. Specific problems require specific solutions. They don't require boilerplate solutions based on general principles that someone discovered by studying a bunch of 19 year old college students. However, the social sciences have done a good job of convincing people that general principles are better solutions for problems than creative, situation-specific solutions. In my experience, creative solutions that are tailor-made for the situation at hand *always* perform better than generic solutions based on one study or another. Hreha is a professional in this field, so presumably he’s right. Still, compare to medicine. A thoughtful doctor who tailors treatment to a particular patient sounds better (and is better) than one who says “Depression? 
Take this one all-purpose depression treatment which is the first thing I saw when I typed ‘depression’ into UpToDate”. But you still need medical journals. Having some idea of general-purpose laws is what gives the people making creative solutions something to build upon. (also, at some point your customers might want to check your creative solution to see whether it actually gives a “3-4% minimum gain, no behavioral economics required”, and that would be at least vaguely study-shaped.) Third, everyone who said nudging had vast effects is still bad and wrong. Many of them were bad and wrong and making fortunes consulting for companies about how to implement the policies they were claiming were super-powerful. This is suspicious and we should lower our opinion of them accordingly. In a previous discussion of growth mindset, I wrote: Imagine I claimed our next-door neighbor was a billionaire oil sheik who kept thousands of boxes of gold and diamonds hidden in his basement. Later we meet the neighbor, and he is the manager of a small bookstore and has a salary 10% above the US average... Should we describe this as “we have confirmed the Wealthy Neighbor Hypothesis, though the effect size was smaller than expected”? Or as “I made up a completely crazy story, and in unrelated news there was an irrelevant deviation from literally-zero in the same space”? All the people talking about oil sheiks deserve to get asked some really uncomfortable questions. And a lot of these will be the most famous researchers - the Dan Arielys of the world - because of course the people who successfully hyped their results a lot are the ones the public knows about. Still, the neighbor seems like a neat guy, and maybe he’ll give you a job at his bookstore. V. Conclusion: Musings On The Identifiable Victim Effect I actually skipped the very beginning of Hreha’s article. I want to come back to it now. It begins: The last few years have been particularly bad for behavioral economics. 
A number of frequently cited findings have failed to replicate. Here are a couple of high profile examples: The Identifiable Victim Effect (featured in the workbooks I wrote with Dan Ariely and Kristen Berman in 2014)
Inline links: Kahneman and Tversky’s original paper, The Loss Of Loss Aversion: Will It Loom Larger Than Its Gain, https://substackcdn.com/image/fetch/$s_!2VLm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8c49b44a-8850-4ac8-9a88-3dd07718a648_397x305.png, Loss Aversion Has Moderators, But Reports Of Its Death Are Greatly Exaggerated, Alex Imas, Replicating patterns of prospect theory for decision under risk, They looked at the results of 126 randomized controlled trials, others, https://substackcdn.com/image/fetch/$s_!1gY_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F5b127d7c-045a-4266-b515-a0e9201ce651_571x382.png, a previous discussion of growth mindset, the workbooks I wrote with Dan Ariely and Kristen Berman in 2014
People take various policy implications from this (maybe “life sentences” should end at 65, since incapacitation is unlikely to help much after that). But here we’re interested in its potential to confound studies. A 20 year old who gets 5 years in prison is released at 25 - still young! - but a 20 year old who gets 10 years in prison is released at 30 - too old to be leaping on rooftops and running from cops. The National Sentencing Commission understands this problem, and matches the experimental and control groups by age at release. But this introduces a new bias - now they’re different ages when they start committing crimes. Might a person who starts crime at 15 be a more disturbed and committed criminal than one who starts at 20? Seems plausible. I think this might be responsible for a lot of the seemingly positive effect of sentences > 5 years. There are dozens of other studies on this topic, all hotly debated, so even in this part I’m only going to list a few highlights. Still, these are: Green and Winik (2010). They use random judge assignment, ie look at criminals with similar crimes who got lenient/strict judges and so shorter/longer sentences. They find that the total difference in rearrests is indistinguishable from zero. But the length of time in which they were measuring rearrests includes the time the offenders were in jail, so this is saying that incapacitation plus aftereffects was zero (plus or minus a margin of error), meaning that aftereffects must be detrimental and large enough to cancel out the benefits of incapacitation, just as Roodman claims. But this study looked at minor crimes where sentences were measured in months, so I think this matches our previous suspicion that aftereffects might be detrimental in short sentences but neutral-to-beneficial in longer ones. Roach and Schanzenbach (2015) More random judge assignment, this time in Seattle. 
They find that each month of longer sentence decreases future reoffending by one percentage point. Most of these sentences are short, so this contradicts our working theory that lengthening short sentences increases crime but lengthening long ones decreases it. Neither Berger nor Roodman really want to take this study too seriously; Berger objects that it’s an unusual study population (everyone entered a guilty plea), and Roodman objects that the judge selection might not have been truly random. Rhodes (2018) is a matching study - it artificially tries to create groups of prisoners who are as similar as possible except that one group got longer sentences. Its big advantage is that it has some people serving moderately long sentences (a few years), getting us out of the few-month range investigated by some of the other studies. It finds a mild beneficial effect of longer sentences: This study provides no evidence that an offender’s criminal trajectory is negatively affected – that is, that criminal behavior is accelerated – by the length of an offender’s prison term. If anything, longer prison terms modestly reduce rates of recidivism beyond what is attributable to incapacitation. This “treatment effect” of a longer period of incarceration is small. The three-year base rate of 20% recidivism is reduced to 18.7% when prison length of stay increases by an average of 5.4 months. We are inclined to characterize this as a benign, close to neutral effect on recidivism. What Do Our Experts Think? As mentioned above, these are only a few of the very many studies on this topic, and I’ve only given the briefest summary of each. Due to the complexity of this literature, I’m relying more than usual on the opinion of the expert reviewers. Berger (pro-longer-sentences) says: Considering the rigorous research published since the Nagin et al. 
(2009) review, the literature regarding length of stay on recidivism is still somewhat inconsistent, with many studies claiming no recidivism effects and some showing that increased prison length reduces recidivism slightly. However, just like the rest of the research examined thus far, the study methodologies vary in terms of their limitations, which could explain some of the mixed results [...] At present, there is no substantial evidence that a criminogenic effect exists in the aggregate. Thus, it remains unclear whether criminogenic effects exist, and if so, under what circumstances...Among the substantial number of published studies with varying methodologies, not one has found a large aggregate-level criminogenic effect. Roodman (pro-shorter-sentences) says: The preponderance of the evidence says that incarceration in the US increases crime post-release, and enough over the long run to offset incapacitation. A quartet of judge randomization studies (Green and Winik in Washington, DC; Loeffler in Chicago; Nagin and Snodgrass in Pennsylvania; Dobbie, Goldin, and Yang in Philadelphia and Miami) put the net of incapacitation and incarceration aftereffects at about zero. In parallel, Chen and Shapiro find that harsher prison conditions—making for incarceration that is harsher in quality rather than quantity—also increases recidivism. Gaes and Camp concur, though less convincingly because in their study harsher incarceration quality went hand in hand with lower incarceration quantity. Mueller-Smith sides with all these studies and goes farther, finding modest incapacitation and powerful, harmful aftereffects in Houston; but modest hints of randomization failure accompany those results. Some studies dissent from the majority view that incarceration is criminogenic. Roach and Schanzenbach find beneficial aftereffects in Seattle—a result that is also subject to some doubt about the quality of randomization. Bhuller et al. 
make a more compelling case that incarceration reduces crime after—in Norway. Berecochea and Jaman, one of the few truly randomized studies in this literature, also looks more likely right than wrong, and is also somewhat distant in its setting, early-1970s California. And there are the two Georgia studies, which upon reanalysis no longer point to beneficial aftereffects, but still do not demonstrate harmful ones either. Aftereffects must vary by place, time, and person. But the first-order generalization that best fits the credible evidence is that at the margin in the US today, aftereffects offset in the long run what incapacitation does in the short run. Nagin (neutral, tie-breaker) says: Compared with noncustodial sanctions, incarceration appears to have a null or mildly criminogenic effect on future criminal behavior. This conclusion is not sufficiently firm to guide policy generally, though it casts doubt on claims that imprisonment has strong specific deterrent effects. What conclusions do we draw from these studies of the dose-response relationship between time served and reoffending? The one experimental study is suggestive of a preventive effect, but that effect may be attributable to incapacitation. Two of the matching studies point weakly to a criminogenic type dose-response relationship, but both are extremely dated. The Loughran et al. (2008) study suggests a possible criminogenic effect of placement but finds no linkage between time served and reoffending. We draw no conclusions from the results of the regression studies. Not only are results extremely varied, but more importantly all of the studies suffer from a fundamental analytical flaw. This flaw relates to the potential sensitivity of regression- based studies to specification errors in the model of the relationship of age and offending rate. 
In other words: Berger and Nagin think evidence is weak and it’s kind of a wash and maybe there are slight criminogenic effects; Roodman thinks there are strong criminogenic effects that (on the current margin) are sizeable enough to approximately cancel out the benefit from incapacitation. So What’s Up With Roodman? At the risk of repeating myself: this is the question upon which this whole essay hinges. Everyone agrees that the beneficial effects of deterrence are real but small. Everyone agrees that the beneficial effects of incapacitation are real and large. Everyone except Roodman agrees that aftereffects range from slightly beneficial to slightly detrimental, for a net effect of incarceration significantly decreasing crime. Only Roodman says that aftereffects are large and detrimental, for a net effect of incarceration having no effect on crime. So where does Roodman disagree with everyone else? My impression is that the main difference is that Roodman gives more weight to certain judge selection studies. These find that being randomly assigned to a lenient vs. strict judge (and therefore on average getting a short vs. long sentence) doesn’t change rearrest rates after X years from the time the sentence started. This X year period includes both the time spent serving the sentence, and the time after release when aftereffects might materialize - ie they include both incapacitation and aftereffects. Since these studies fail to find any net effect, and incapacitation effects must be beneficial and large, Roodman concludes that aftereffects must be detrimental and large. Then he reanalyzes several of the other studies that other people use to demonstrate no or beneficial aftereffects, and finds them less convincing after reanalysis. So who is right? Roodman gets his strongest evidence from studies of short sentences vs. shorter sentences (eg going from 0 to 1 years, or 1 to 2 years).
These are naturally where we would expect the fewest benefits from incapacitation. But they’re also where we would common-sensically expect the worst aftereffects. Someone going from zero prison to one year in prison has had their life, career, and relationships profoundly changed, in a way that someone going from ten years in prison to eleven years hasn’t. This is consistent with the National Sentencing Commission study above. They found that aftereffects trended worse the shorter the sentences got, but didn’t investigate any sentences shorter than 2-3 years. If the trend continues, sentences shorter than that could have aftereffects > incapacitation. So maybe Roodman is right about shorter sentences, and everyone else is right about longer sentences. Going from a month to a year in prison is so disruptive and criminogenic that it risks canceling the benefits of eleven extra months of incapacitation. But going from ten years to eleven years mostly just gives you the incapacitation. Marginal Revolution This highlights a problem with all of these studies: we can only talk about particular margins. Imagine a country which currently incarcerates zero people, trying to decide whether to move up to a policy of incarcerating one person. If you only incarcerate one person, it will be the baddest dude in the whole country. That guy really needs to be behind bars! And we’re not worried about turning him into a hardened criminal, because he’s already maximally bad. Here it’s obvious that benefits outweigh costs. Now imagine a country which incarcerates 50% of its population, trying to decide whether to move up to 50% + 1. At this point, you’re imprisoning someone who went a few miles over the speed limit. 
You gain no benefits from incapacitation (he wasn’t going to commit any crimes anyway), but you stand to lose a lot from aftereffects (he’s probably a totally normal law-abiding citizen, so there’s a very high risk of ruining his life and turning him into a more hardened criminal). Here it’s obvious that costs outweigh benefits. So the question isn’t “do the costs of prison outweigh benefits?”, but rather “at what point between incarcerating 0% and 50% of people does the cost of imprisoning one more person start outweighing the benefits?”, or even “at the current US incarceration rate of 0.75%, does the cost of imprisoning one more person outweigh the benefits?” In some sense, this is what we’ve been investigating the whole time - all of these studies are being conducted at the current margin. But this hides big differences between them. We’ve already seen that European studies get stronger results than American studies. That’s because European countries have incarceration rates of ~0.05%, compared to America’s ~0.75%. In theory, European countries’ incarceration rates are lower because they have less crime. But I notice that the European countries we’re talking about here all have high recent new immigrant populations, and in Europe these groups commit more crimes per person than natives. So it’s possible that Europe is still adjusting to being a high-crime continent, whereas America has already adjusted by raising incarceration rates. So one possible conclusion is that the benefits of incarceration strongly outweigh costs in Europe. I think this is clearly true by American values - we seem to care more about preventing crime, and be less horrified by imprisonment, than the average European. But there are many different margins even within America. Louisiana’s incarceration rate is >1%; Massachusetts is <0.25%. Some of the variance reflects the criminality of each state’s population, but other variance reflects the values of each state’s voters and policy-makers.
We haven’t been keeping great track of which state each of our studies comes from, but plausibly the marginal prisoner in Massachusetts is a badder dude than the marginal prisoner in Louisiana, and releasing him is more likely to have costs > benefits. Margins also differ across eras. US incarceration ranged from 0.2% in 1970 to 0.95% in 2007 to about 0.75% today. Our studies cover this entire time period. This is probably why Levitt found stronger incapacitation effects (studying the 1970s) than Owens or Lofstrom+Raphael (studying the 2000s). Finally, there are the margins across sentences we discussed earlier. Going from zero years in prison to one year is a bigger deal than going from ten to eleven. When we examine our original question - does extending the average prisoner’s sentence for one year substantially decrease crime, we find that there’s no single answer - it depends where we are on all of these margins. Roodman’s skeptical position is most plausible for shorter sentences in high-incarceration areas, and Berger’s pro-prison position is most plausible for longer sentences in low-incarceration areas. So Why Do People Keep Saying That Prison Doesn’t Decrease Crime? We began with the observation that criminologists tend to deny that prison decreases crime. We now know why Roodman thinks this: he idiosyncratically believes that aftereffects equal (and so cancel out) incapacitation. But nobody else has even gotten this far. So what’s everyone else’s position? The Vera Institute is an anti-incarceration think tank. They have a policy paper titled The Incarceration Myth: More Incarceration Will Not Decrease Crime. It says: There is a very weak relationship between higher incarceration rates and lower crime rates. Although studies differ somewhat, most of the literature shows that between 1980 and 2000, each 10 percent increase in incarceration rates was associated with just a 2 to 4 percent lower crime rate. 
This is just taking the (real, positive) effect of incarceration on crime, and calling it “very weak”. Research shows that each additional increase in incarceration rates will be associated with a smaller and smaller reduction in crime rates. We saw above that this is true, but I find it annoying to mention here in this kind of advocacy context - it’s also true of everything else in the world! When the Vera Institute publishes anti-mass-incarceration white papers, the 500th white paper will be less influential than the first. If I claimed that “research showed” this, and so they should stop publishing anti-mass-incarceration white papers, they would look at me like I’d gone insane. Get a life. The weak association between higher incarceration rates and lower crime rates applies almost entirely to property crime. Research consistently shows that higher incarceration rates are not associated with lower violent crime rates. This is sort of true. Research finds a stronger effect of incarceration on property crimes than violent crimes, although Levitt does find a violent crime effect of minus one violent crime per incarceration-year. Partly this is because violent crimes are rarer than property crimes, and so studies are underpowered to find them. And partly it’s because most studies are done on mass releases of prisoners, where (for example) the state has to release 25% of the prison population to decrease overcrowding, but they get to choose which 25% - and states are smart enough not to release the murderers and psychos. Still, if Vera Institute’s preferred decarceration policy is also smart, then it won’t release the murderers and psychos either, and this point will stand. So my interpretation of Vera Institute is that they’re making some good points about ways that incarceration isn’t an infinitely powerful cure-all, but that it’s deceptive to summarize them as “incarceration doesn’t decrease crime”. What about other groups? 
Prison Policy Institute has a list of “crime myths”. Myth #7 is that “Harsh punishments deter crime, making us safer”. They write: Many people mistakenly believe that long sentences, paired with austere and even brutal prison conditions, will have a deterrent effect on crime. But research has consistently found that harsher sentences do not serve as effective “examples” that would prevent new people from committing serious crimes. In 2016, the National Institute of Justice summarized the research on deterrence, finding that prison sentences, and especially long sentences, do little to deter future crime. Here they’re using “deterrence” in the strict sense (that is, in a way that doesn’t count incapacitation), noting that it’s small, and rounding off “small” to “zero”. I’ve looked at some other sites and think tanks that claim to have arguments against the “myth” that prison prevents crime, and they’re all using these same two tricks. Either they ignore incapacitation and focus only on deterrence + aftereffects. Or they imagine some hypothetical prison super-fan who believes that incapacitation is infinitely effective, prove that it’s less effective than this, declare victory over this fake opponent, and then summarize their win as “prison has no effect”. What Are The Costs Vs. Benefits Of Prison? So a more honest version of the claim that “prison has no effect on crime” might be “the effect of prison on crime is weak”. How weak is it? We already saw one way to answer this: it probably prevents on average 7 crimes/year (6 property + 1 violent), minus some amount, especially for short sentences, if you believe in criminogenic aftereffects. For the shortest sentences at the highest-incarceration margins, it’s possible for the effect to be zero or less. Another way to answer is with elasticities. If we increase the incarceration rate by 10%, how much crime do we prevent at the current margins? 
Levitt estimates 3%, Cohen finds 0.5-7%, and Dhodnt finds -2% (ie prison increases crime) but this is an outlier. Spelman writes: Our best estimate of elasticity is “in the neighborhood of [3% drop in crime per 10% increase in incarceration]” but “[a]ny figure between [2% and 4%] can be defended, and we should not be too surprised to find that the result is anywhere between [1% and 5%]” This broadly agrees with our numbers from Sweden, California, and El Salvador above. Small increases in incarceration cause small decreases in crime. Large increases in incarceration cause large decreases in crime. If you doubled the incarceration rate, locking up an extra million people, then crime would decrease ~30% at current US margins (maybe less, because you’re shifting the margin and getting diminishing returns). Would more prison be good or bad? We’d need to do a cost-benefit analysis. Surprisingly, Roodman does the best work here: after making his claim that costs and benefits mostly cancel out, he admits that most people won’t believe him, and tries to estimate the effect size in the “devil’s advocate” case where everyone else is right and he is wrong. He starts with our previous finding that incapacitation prevents ~7 crimes a year, and returns to the incapacitation studies to see what types of crime are most affected. Then he adjusts for the low level of aftereffects that everyone else believes in. I’ve redone his results for clarity. This table shows the total number of each type of crime prevented by keeping the marginal prisoner in jail for one extra year: Why does prison prevent negative robberies? Roodman is subtracting the small aftereffects found by other researchers, and the data for rare crimes is noisy, so probably this is just an artifact. I round this to zero for the full analysis. If we’re trying to calculate the costs vs. benefits of imprisonment, we need to put a cost on all these crimes. 
This is hard to quantify - a robber may steal $100 worth of goods, but valuing his crime at $100 in costs ignores the disutility of (eg) living in fear. Roodman uses two methods: first, he values a crime at the average damages that courts award to victims, including emotional damages. Second, he values it at what people will pay - how much money would you accept to get assaulted one extra time in your life? These estimates still exclude some intangible costs, like the cost of living in a crime-ridden community, but it’s the best we can do for now. Here are his answers (I’ve taken the geometric mean of the two methods): So one extra year of incarcerating the marginal criminal saves society $44,000 in crimes prevented. Now we add in the opposite side of the ledger: the costs of incarceration. According to Roodman, the average prisoner costs the state $31,000 per year. He got his data from 2008, and it’s since ballooned to about $60,000, but we’ll keep his number so that everything is from the same time period. (also, as always, California is more expensive - here it’s $120,000) Roodman also adds in the costs to the prisoner. He uses some surveys to value the disutility of the suffering caused by a year in prison at $50,000; additionally, the prisoner loses about $16,000 in earning potential. The end result: if you don’t count the costs to the prisoner themselves, and you don’t use the more modern number, and you’re not in an expensive state like California, then the marginal incarceration-year saves society about $13,000. If you do count those things, or you’re in an expensive state, the costs far outweigh the benefits. Realistically, most people won’t care about analyses like this. They’ll be more interested in the unquantifiable costs and benefits, including: The “benefit” of feeling like justice has been done and an evil deed has been avenged.
Inline links: Green and Winik (2010), Roach and Schanzenbach (2015), Rhodes (2018), incarceration rates are lower because they have less crime, The Vera Institute, The Incarceration Myth: More Incarceration Will Not Decrease Crime, a list of “crime myths”, Cohen, Dhodnt, Spelman
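The cost-benefit arithmetic in the prison passage above can be retraced in a few lines. What follows is a minimal Python sketch, not Roodman's actual model: the dollar figures and the 3%-per-10% elasticity are the ones quoted in the passage, while the constant and function names are hypothetical, and the elasticity helper assumes a simple linear extrapolation that ignores the diminishing returns the passage mentions.

```python
# Figures quoted in the passage (Roodman's 2008-era numbers unless noted).
CRIME_SAVINGS = 44_000           # value of crimes prevented per marginal incarceration-year
STATE_COST_2008 = 31_000         # state cost per prisoner-year in Roodman's data
STATE_COST_RECENT = 60_000       # the passage's more recent figure
STATE_COST_CALIFORNIA = 120_000  # the passage's California figure
PRISONER_SUFFERING = 50_000      # survey-based disutility of a year in prison
PRISONER_LOST_EARNINGS = 16_000  # lost earning potential


def net_benefit(state_cost, count_prisoner_costs=False):
    """Net dollar value of one extra incarceration-year (negative means a net cost)."""
    costs = state_cost
    if count_prisoner_costs:
        costs += PRISONER_SUFFERING + PRISONER_LOST_EARNINGS
    return CRIME_SAVINGS - costs


def crime_drop_pct(incarceration_increase_pct, drop_per_10pct=3.0):
    """Percent drop in crime for a given percent rise in incarceration.

    Assumption: a straight-line extrapolation of the quoted elasticity,
    which overstates the effect of large increases.
    """
    return incarceration_increase_pct / 10.0 * drop_per_10pct


if __name__ == "__main__":
    print("2008 cost, state only:      ", net_benefit(STATE_COST_2008))
    print("2008 cost, prisoner counted:", net_benefit(STATE_COST_2008, True))
    print("recent cost, state only:    ", net_benefit(STATE_COST_RECENT))
    print("California, state only:     ", net_benefit(STATE_COST_CALIFORNIA))
    print("doubling incarceration (%): ", crime_drop_pct(100))
```

Only the first scenario (2008 state cost, prisoner costs excluded) comes out positive, matching the passage's figure of about $13,000; every other combination is a net cost.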
1: When was the vibecession? 2: Is the vibecession just sublimating cultural complaints? 3: Discourse downstream of the Mike Green $140K poverty line post 4: What about other countries? 5: Comments on rent/housing 6: Comments on inflation 7: Comments on vibes 8: Other good comments 9: The parable of Calvin’s grandparents 10: Updates / conclusions
Obviously nothing real changes the exact second a new president is inaugurated, so people must be using questions about the economy to express their overall happiness about the state of the world. Alex asks whether increasing political polarization could make this worse. Both parties’ extreme factions share a tendency to treat the country as controlled by a hegemonic conspiracy of their enemies - the woke coastal elite Soros cosmopolitan establishment, or the neoliberal fat cat Koch Brothers tech oligarch blob. Does this mean everyone is getting some multiple of the “other party’s president is in power” effect all the time? 3: Discourse Downstream Of The Mike Green $140K Poverty Line Post … Shovacklerod writes: Scott have you read Mike Green’s viral post on this? His main argument is that the poverty line is miscalculated, but in context of declining middle class sentiments— The more interesting thesis is that there exists a “valley of death” where two parents in the workforce need a combined ~$140k salary otherwise the cumulative “participation costs” of a fast modern society (for example a phone plan or child care) make year-over-year capital accumulation near impossible. I haven’t, but other commenters suggest reading responses, including Noah Smith’s The $140,000 Poverty Line Is Very Silly, Jeremy Horpedahl’s The Poverty Line Is Not $140,000, and Tyler Cowen’s The Myth Of The $140,000 Poverty Line. Most of these focus on Green’s explicit errors - for example, he gets most of his cost-of-living numbers from Essex, NJ, an especially rich county, then compares them to average earnings. Correct half a dozen things like this, and the real poverty line is probably somewhere between $35K - $60K. The percent of Americans below this line continues to decline every year, as it has for decades. 
Green finally pseudo-apologized, lambasting the “mockery machine” of the “cognitive elite” but admitting that his post “was never intended to go viral and was written for my existing audience that tends to be pretty understanding that I don’t do this for a living, but rather as PART of my living.” Still, many people took Green’s article as a starting point to contribute to the Vibecession discourse, so let’s go over the ones that touch on our topic in more detail. Lincicome titles his response The $140,000 Poverty Line Is Wrong, So Why Does It Feel Right?, and blames Baumol’s cost disease: As the Financial Times’ John Burn-Murdoch just detailed, Americans’ overall cost of living has improved over time, but certain highly visible and socially desirable services have become more expensive. That’s not a conspiracy against the middle class but instead just Baumol at work: “[A]s countries develop economically, the same productivity growth that drives down the cost of tradeable goods causes the cost of in-person services to balloon. Wages in sectors like healthcare and education that require intensive face-to-face labour, and have slow (if any) productivity growth, are forced upwards in order to attract workers who would otherwise opt for high-paying work in more productive sectors. The result is that even if people keep consuming the exact same basket of goods and services, as living standards in their country increase they will find more and more of their spending is going on essential services.” Sectors where productivity grows slowly and prices outpace inflation—health care, education, child care, personal services, housing (construction), etc.—happen to be the same ones that middle-class families notice most and that signal social status. As we’ve all gotten richer, moreover, these services have transitioned from luxuries to expectations. 
Throw in the hedonic treadmill and the fact that you can’t price-shop schools or hospitals the way you can TVs, and public alarm is all but inevitable. I’m suspicious of including “housing (construction)” on this list - couldn’t you use the same argument to reclassify any manufactured good as a service good? - but the rest of these are well-taken. Still, did Baumol or the other economists who first discussed the effect in the 1960s predict it would make people feel like things were outright worse, as opposed to just getting better less than would be expected from raw productivity numbers? Seems strange. Also, hasn’t the Baumol effect been basically constant since at least the Industrial Revolution? And isn’t the Vibecession only 5 - 20 years old? Matt Bruenig has his own response to Green, Why Do People Feel Like They’re Falling Behind? He bases his argument around this graph: …which is just making the common-sense point that, as society shifts from one-income to two-income families, the husband’s share of family income drops from ~100% to ~50%. So, Bruenig argues, if everyone is trying to keep up with the Joneses, and the Joneses are a dual-earner family, then this single working man has gone from making 100% of his comparison point, to making only 50%. This is a cool potential cognitive bias, but is anyone really making this mistake? Vibecession complaints hardly seem limited to men in traditional one-earner households wondering why they’re not making as much as the neighbors whose wife is a fancy lawyer. My impression is that they include both two-earner families who still feel like they’re falling behind, and (most of all) young singles who are comparing themselves to their young single friends where this issue never comes up in the first place. Matt Yglesias uses a similar strategy in You Can Afford A Tradlife. This is what they took from you. They never should have passed the ‘Make It Illegal To Wear Hair Gel And Marry A White Woman Act' back in 1959! 
He argues that the reason most wives work these days isn’t because we’re poorer (and they have to work to survive), but because we’re richer (and so wives can make so much money working outside the home that the opportunity cost is too high to pass up). A single earner could still support a family on a 1950s lifestyle. It would just feel like a failure, because we don’t realize how much worse the 1950s lifestyle was compared to our current conditions. The article’s paywalled, but you can get a pretty good sense of the argument from these paragraphs. After determining that the median man makes about $80,000/year, he writes: Let’s say our $80,000-a-year man is living in the Jacksonville area. The Department of Housing and Urban Development calculates what are called Fair Market Rents for each American metro — this means the 40th percentile rent for a home with any given set of characteristics. They say F.M.R. for a three-bedroom home in the Jacksonville area is $2,163. That comes out to about 30 percent of Mr. Median’s annual income. Can you really get a place to live for that little? Here’s a lovely three-bedroom home in the East Arlington neighborhood for $2,020 a month, and it’s zoned for an elementary school with a 10-out-of-10 ranking from GreatSchools. It’s true that 1,617 square feet is on the small side for, say, a family of five in the contemporary United States. But the average size of a new single family home was 1,289 square feet in 1960 and 1,500 square feet in 1970. Two of your kids are going to need to share a bedroom, but that’s how people lived back in the day. There’s more to life than housing, of course, but I started there because that’s the largest item in a household budget. Durable goods like furniture, cars, and appliances have all become better and more affordable since the mid-1960s. That’s partially offset by rising prices for things like college tuition, child care, and health care. 
But in the 1960s, most young people didn’t go to college. The way health insurance works, you only need one worker in your family to get a job-based health plan. And of course, with your wife serving as a full-time homemaker, you don’t need to worry about child care expenses. The big thing is that, with a larger family, you literally have a bunch of mouths to feed. But the model here is to replicate how people actually lived in the mid-1960s, which is that they dined out much less frequently and also spent a much larger share of their total income on food. When I try to retrace this, it seems possible, but barely. I imagined doing this in Sacramento, to be near family. Suppose I make $80K pretax = $6.6K/month pretax = $5K per month posttax. A cheap 3-bedroom house on a nice-enough block is $2200 mortgage, assume $3K after property taxes etc. A cheap new car is $350/month. Food can be arbitrarily low if you’re willing to eat rice all the time, but let’s say $250/month. CoveredCalifornia offered my family of four healthcare for $600/month. So top four expenses take $4200/month of the $5000/month posttax income. I don’t know; seems tough. I would like to see a more thorough breakdown of an average 2026 vs. 1956 man’s likely budget. There are also some areas where it’s harder to separate genuine declines from rising expectations. Most people in the 1950s didn’t have health insurance. Was that because they accepted lower levels of health, or because medical care was cheaper, and easy enough to afford out-of-pocket? Probably some very complicated combination of both. And it might be impossible to get certain kinds of 1950s medical care today, i.e. a bed in a cheap low-quality shared hospital room. (some of the best discussion around this came from the response to Elizabeth Warren’s The Two-Income Trap, see eg Matt Bruenig here) Still, I find this tangential to the main point. Yes, a few conservatives complain that it’s hard to have a single-income family. 
But most vibecession complaints come from singles or dual-earner households! 4: What About Other Countries? … Dionysus writes: Did you know that China also has a vibecession? If even China can’t regulate social media heavily enough to prevent this phenomenon, how can any liberal society possibly hope to? The link goes to an NYT article, which includes quotes like: Using apps like RedNote and Douyin, people are reviving memories of the 2000s and the early 2010s with photos of daring outfits, upbeat songs and vintage TV commercials, all of which, in different ways, evoke a time in China that pulsed with optimism. “The music back then throbbed with exuberance, brimming with the sense that the future could only get brighter,” a middle-aged man said in a RedNote video. “Today’s lyrics begin with lines like, ‘We’re trying our best to survive.’” And The boom-time beauty meme is the latest expression of a Gen Z counterculture born of disillusionment, the recognition that they may be the first generation in half a century unlikely to surpass their parents’ standard of living, no matter how hard they try. Over the past five years, this quiet resistance has taken many forms. It began with “lying flat,” a refusal to join the rat race. Some chose to pursue the “run philosophy,” or emigrating in search of freedom and brighter prospects. Others declared themselves the “last generation,” vowing not to have children. Still others embraced “let it rot,” giving up on difficult goals rather than battling for uncertain rewards. To show they could care less about career prospects, many took to wearing “gross outfits” at work. This is especially crazy in China, where GDP per capita is now ten times what it was back during the “Boom Years” that everyone reminisces about. This might be the smoking gun that people’s economic beliefs are totally unmoored from how rich they are. The Chinese story has an obvious moral: people care about growth rate more than level. 
But even this doesn’t work for America - our Vibecession doesn’t correspond to a period of unusually low growth. machine_spirit writes: It’s interesting to compare it to Europe as the control group. Unlike the US, whose economy muddled through just fine during the last decade, we are currently experiencing a massive economic decline that could soon turn into a full-blown collapse. And yet, outside of debates about immigration or foreign policy especially regarding Ukraine you don’t really hear the same level of rancour about ‘things being bad’ in the local media. I’m surprised to hear this. I hear many economic complaints from Europeans, but I suppose this passes through my own American filter bubble which is incentivized to talk about economic hardship for its own American reasons. Golden Feather writes: I am an Italian currently living in the US. My main guesses would be: Right-wing parties control a supermajority of TV and print media. They have also been in the govt most of the time, which means they control the state TV and have an interest in presenting things as rosey. The much older population makes the internet less relevant for public sentiment. Even in the few years where they were at the opposition, they mostly focused on immigration and crime to rile up popular sentiment, I guess because the population is older, their voters even moreso, so they care more about that than about the economy
Inline links: writes, The $140,000 Poverty Line Is Very Silly, The Poverty Line Is Not $140,000, The Myth Of The $140,000 Poverty Line, The $140,000 Poverty Line Is Wrong, So Why Does It Feel Right?, Baumol’s cost disease, detailed, Why Do People Feel Like They’re Falling Behind?, You Can Afford A Tradlife, They say, lovely three-bedroom home in the East Arlington neighborhood for $2,020 a month, average size of a new single family home was 1,289 square feet in 1960 and 1,500 square feet in 1970, how people actually lived in the mid-1960s, dined out much less frequently, see eg Matt Bruenig here, writes, China also has a vibecession?, lying flat, run philosophy, gross outfits, writes, writes
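The single-earner budget in the Vibecession passage above is easy to recheck. This is a rough Python sketch under the passage's own assumptions: the $5,000 post-tax figure is the author's estimate rather than a tax calculation, the expense amounts are the ones quoted for Sacramento, and the dictionary keys and function name are hypothetical labels introduced here for illustration.

```python
ANNUAL_PRETAX = 80_000   # median man's income used in the passage
MONTHLY_POSTTAX = 5_000  # the passage's rough post-tax estimate (assumed, not computed)
JACKSONVILLE_FMR = 2_163 # quoted Fair Market Rent, three-bedroom, per month

# The four largest monthly expenses quoted for the Sacramento scenario.
MONTHLY_EXPENSES = {
    "housing (mortgage plus property taxes etc.)": 3_000,
    "car": 350,
    "food": 250,
    "health insurance": 600,
}


def budget_summary(monthly_posttax, expenses):
    """Total the expenses and compare them with post-tax monthly income."""
    total = sum(expenses.values())
    return {
        "total": total,
        "left_over": monthly_posttax - total,
        "share_of_income": total / monthly_posttax,
    }


def rent_share_of_pretax(monthly_rent, annual_pretax):
    """Annualized rent as a fraction of pre-tax income."""
    return monthly_rent * 12 / annual_pretax


if __name__ == "__main__":
    summary = budget_summary(MONTHLY_POSTTAX, MONTHLY_EXPENSES)
    print(f"monthly pre-tax income: {ANNUAL_PRETAX / 12:,.0f}")
    print(f"top four expenses:      {summary['total']:,}")
    print(f"left over per month:    {summary['left_over']:,}")
    print(f"share of post-tax pay:  {summary['share_of_income']:.0%}")
    print(f"Jacksonville rent share of pre-tax pay: "
          f"{rent_share_of_pretax(JACKSONVILLE_FMR, ANNUAL_PRETAX):.1%}")
```

On these numbers the four expenses absorb most of the post-tax income, leaving a few hundred dollars a month for everything else, which is consistent with the passage's verdict of "possible, but barely".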