Polymarket

Article

Polymarket is a recurring organization in the Astral Codex Ten archive, appearing 44 times across 44 issues between February 08, 2021 and March 03, 2026. The archive places it in contexts such as “Polymarket is another cryptocurrency-based prediction market”; “getting money into Polymarket has gone from impossible to merely annoying”; and “Here are some of the more interesting Polymarket markets open”. It most often appears alongside Metaculus, Manifold, and Kalshi.

Metadata

  • Category: Organizations
  • Mention count: 44
  • Issue count: 44
  • First seen: February 08, 2021
  • Last seen: March 03, 2026

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

February 08, 2021 · Original source
Polymarket is another cryptocurrency-based prediction market. It's got about two dozen contracts open, and some of them are pretty big - $5 million plus! With that kind of money, we ought to be seeing some really good predicting! We're...not. Either there's a 6% chance that Donald Trump will be president again by March 31, or something's gone wrong.
Probably it's the second one. I tried to bet against Trump, but getting money into the market was pretty hard. You need USDCoins, a stablecoin related to Ethereum. Polymarket tries to let you buy them directly, but their app wanted me to give them a security code which never showed up, so I gave up on this. Instead I bought some USDC at Coinbase and tried to send them over. But along with the usual Ethereum gas fees, they have something called a relayer, which is supposed to collect my money and put it in my account. And it's apparently heavily backed up, and after two days my money is nowhere to be seen (though I believe them when they say that they're trying their hardest and it will probably percolate through the Ethereum network someday). Maybe everyone's having these kinds of issues and this is why the Trump contract hasn't adjusted? I'm not sure. I will keep you updated if my money ever materializes.
June 22, 2021 · Original source
I’m happy to report that getting money into Polymarket has gone from impossible to merely annoying. Non-Americans can apparently do it directly with a credit card; Americans will have to send USDC, separately send Ethereum to a different address to cover transaction fees, then wait ~10 minutes for everything to percolate through. My level of crypto knowledge is “can use Coinbase” and I was able to figure it out. There’s also apparently an easier way with a Metamask wallet, which I didn’t try.
Here are some of the more interesting Polymarket markets open:
For comparison, PredictIt has Adams at 64%, so some good convergence going on here. PredictIt’s NYC mayor market says “17 million shares traded” - if an average share is 50 cents, that means PredictIt has about $8.5 million in volume. Polymarket works a little differently and has yes/no markets for each candidate; the vast majority (a little over $1 million) is on the Yang market. So Polymarket has a little over 10% of the liquidity of PredictIt on this one.
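The liquidity comparison in the passage above can be checked with a short calculation. This is only a sketch of the passage's own back-of-envelope reasoning: the 50-cent average share price is the passage's stated assumption, the Polymarket figure is rounded down to exactly $1 million (the passage says "a little over"), and the function and variable names are hypothetical.

```python
def estimated_volume_usd(shares_traded: int, avg_price_usd: float) -> float:
    """Rough dollar volume: number of shares times an assumed average price."""
    return shares_traded * avg_price_usd


# Figures quoted in the passage; the 50-cent average price is its assumption.
predictit_volume = estimated_volume_usd(17_000_000, 0.50)

# "A little over $1 million" on the Yang market, rounded down for illustration.
polymarket_volume = 1_000_000.0

ratio = polymarket_volume / predictit_volume

print(f"PredictIt volume: about ${predictit_volume:,.0f}")
print(f"Polymarket / PredictIt: about {ratio:.1%}")
```

Under these assumptions the ratio comes out near 12%, consistent with the passage's "a little over 10%".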
July 27, 2021 · Original source
Polymarket remains a fun alternative way to learn about the news. I only heard about the monkeypox issue a few days ago, and hearing “22% chance of it spreading” is both faster and more useful than some article that dithers for a few paragraphs and finally concludes that “health officials warn Americans not to panic”. I would count it a minor victory if one day news sources routinely included this in their articles, eg “Polymarket, a major prediction engine, estimates a 22% chance that at least one other person will catch the disease.”
Extra credit for the last market, which seems to be successfully predicting a scalar instead of a binary outcome - I’ve seen Metaculus experiment with this technology, but this is the first time I’ve spotted it at Polymarket using real money.
Here’s Polymarket:
November 01, 2021 · Original source
— In the US, real-money prediction markets are still illegal, unless they’ve undergone the harrowing, expensive, and highly constraining process of registering as a securities exchange. The Commodity Futures Trading Commission, the relevant regulatory watchdog, is investigating Polymarket for not doing this. I hope everyone involved will be able to come to an agreeable solution instead of crushing what’s currently the leading prediction market or forcing it to become worse.
November 15, 2021 · Original source
Polymarket: 98% chance the Republican wins (he did)
You can imagine chaining this. In 2095, you ask people to predict the actual answer. In 2090, you ask people to predict the value of the 2095 market on December 31, 2095. In 2085, you ask people to predict the value of the 2090 market on December 31, 2090. The chain ends with you putting a market on Polymarket tomorrow asking what the market will think on December 31, 2025. This should work.
Polymarket:
December 27, 2021 · Original source
Polymarket, PredictIt, and Kalshi are silent on this question for now.
2: Balaji Srinivasan suggests using prediction markets to judge the winner of college debates: “@pairagraph You might run formal debates. Have people sign in with ENS for proof of identity. And try a prediction market like @PolymarketHQ to see who won the debate, with Oxford-style scoring.” (@balajis, December 13, 2021) I’m not sure I understand this very well yet but maybe someone else can explain it to me.
February 01, 2022 · Original source
PREDICTION MARKETS
88. No new real-money prediction market becomes bigger than Polymarket: 70%
89. Manifold Markets is still alive and active: 30%
90. New legal US real-money prediction market at least half as big as Kalshi: 5%
91. New illegal but easy-to-use market satisfying the above: 20%
92. I post my scores on these predictions before 2/1/23: 80%
February 07, 2022 · Original source
Long Live Polymarket
Polymarket got fined $1.4 million by the Commodity Futures Trading Commission and was ordered to cease noncompliant trading in the US.
Polymarket is probably the biggest prediction market currently available. US law considers unlicensed prediction markets to be somewhere between illegal gambling and illegal futures trading, ie definitely illegal. Polymarket and a few peers had survived anyway, through the “crypto is the Wild West and nobody has time to deal with all the illegal things happening there” exemption. Apparently they found time.
February 08, 2022 · Original source
1: I titled part of my post yesterday “RIP Polymarket”, which was a mistake. Polymarket would like to remind everyone that they are very much alive, with a real-money market available to anyone outside the US, and some kind of compliant US product (maybe a play-money market) in the works.
February 13, 2022 · Original source
1: The team behind Polymarket want me to clarify that despite the tone of my post about them they do still exist, they’re open for real-market trading outside the US, and they might have some kind of compliant US product in the future. I apologize for inadvertently implying they were dead.
February 14, 2022 · Original source
These run from about 48% to 60%, but I think the differences are justified by the slightly different wordings of the question and definitions of “invasion”. You see a big jump last Friday when the US government increased the urgency of their own warnings. I ignored this on Friday because I couldn’t figure out what their evidence was, but it looks like the smart money updated a lot on it.
A few smaller markets that Clay didn’t include: Manifold is only at 36% despite several dozen traders. I think they’re just wrong - but I’m not going to use any more of my limited supply of play money to correct it, thus fully explaining the wrongness. Futuur is at 47%, but also thinks there’s an 18% chance Russia invades Lithuania, so I’m going to count this as not really mature. Insight Prediction, a very new site I’ve never seen before, claims to have $93,000 invested and a probability of 22%, which is utterly bizarre; I’m too suspicious and confused to invest, and maybe everyone else is too.
(PredictIt, Polymarket, and Kalshi all avoid this question. I think PredictIt has a regulatory agreement that limits them to politics. Polymarket and Kalshi might just not be interested, or they might be too PR-sensitive to want to look like they’re speculating on wars where thousands of people could die.)
What happens afterwards? Clay beats me again:
For context:
So it looks like forecasters expect that, conditional upon Russia invading at all, there’s an 80% chance they’ll take Mariupol in the east, a 66% chance they’ll take Kharkiv (also eastern, but only a third ethnic Russian and currently aligned with the central government), and only about a 30% chance they take Kyiv or Odessa. See also this thread full of speculation in the subreddit. As for me, I’m going all in on “yes” after seeing this tweet:
Alexander Cube
Last week I speculated that to truly realize the potential of prediction markets, we’d need one that was real money, easy to use, and easy to create markets on.
Gustavo Lacerda and Nuno Sempere very kindly drew this picture and named it after me: Nobody has reached the promised land at the furthest point. But all three connected vertices are occupied. Augur is real-money and lets people create their own markets, (but it’s impossible to use - it’s made of complicated crypto contracts that nobody’s made a workable front end for yet). Polymarket is real money and easy to use (but doesn’t let people create their own markets; apparently they’re nervous about resolution disputes). Manifold is easy to use and lets people create their own market, but it’s not real money (they’re American and centralized, so they have to follow anti-gambling regulations). Manifold Markets Speaking of which, they’re open! As the cube suggests, Manifold is a site where anyone can create their own (play money) prediction market. They set the question and they decide when and how it resolves (with everyone else just out of luck if they decide to fake it or rug-pull). It’s a bold strategy, but boy oh boy are people liking it so far: Not actually in order This is a semi-randomly selected sample of Manifold markets, but let’s go through them one by one. The Ukraine market is the biggest on Manifold. It’s also deeply out of step with every other prediction market and the top non-prediction-market authorities - who are all giving numbers in the 50s and 60s. I don’t understand how this is so low - yes, play money < real money, but mostly because play money doesn’t get enough people betting. Here lots of people are betting - it’s the biggest market on the site, and since you only start with $1000 either twenty people have bet everything or more people have bet a fraction - but it’s still wrong. I tried to spend some play money to correct it and it snapped back to just as wrong as it was before. I have no explanation. Midnight The Stray Cat is the second biggest market on Manifold, just after Ukraine. 
I guess the Internet really liking cats shouldn’t be a surprise at this point. In case you need to do research first I’m told this is the cat in question: Props to Manifold for a bunch of markets like the third one on there, where they eat their own dog food by using their market to predict how their business decisions are going to go. ACX Bot has copy-pasted all of my predictions from 2022. At some point they should be able to compare their results with Zvi (ie a single very smart person), with the contest many of you entered (ie an average of formless crowdsourced predictions), and Metaculus (ie a non-monetary forecasting tournament). I’m looking forward to it! Most of you already know Lars Doucet, who’s written some great ACX posts on Georgism. I don’t know what possessed him to make a Joe Rogan Georgism interviewee market, unless he’s gunning for the position. Valinor is a group house on my street, with ~a dozen people living in and around it. We’ve been talking about fixing the backyard for a while. Now we can bet about whether it will happen. Having a number for this actually affects some of my decisions a little. Connor is hijacking the prediction market to make a poll, which is pretty cute. Dwayne Johnson does not have a 15% chance of winning the election. Manifold is suffering from the usual play money problem, where if you only start out with $1000 in play money, nobody wants to lock it up for three years to make a 15% profit. Vivek’s market, “Will I believe that 13177 is a prime number”, is pretty unusual. I’m interpreting it as a test/demonstration of prediction markets’ information-gathering ability. If you don’t know something and it’s hard to Google, you can make a prediction market about whether you’ll believe it in the future, and people who are able to figure out the answer will bet on it. Based on the 97% YES rate, I’m guessing 13177 is in fact a prime number. What else can you do this with? 
TANSTAAFL’s “Will I Be Convinced That Justin Trudeau Is Not Fidel Castro’s Son?” market is maybe pushing the limit of this methodology. Anyway, there are lots of me-too prediction markets but this is something genuinely new under the sun. Maybe it will be awesome itself, but I’m also hoping it helps bigger players realize how much more is possible. This Week In Metaculus A few new questions on intelligence enhancement, eg: The question explicitly allows embryo selection, but says it must raise IQ ten points and be available for <25% median income to count. Trivial improvements to existing embryo selection will top out around 9 points, so this seems to be predicting something more interesting, maybe iterated embryo selection at the very least. I’m probably slightly bearish on this one; I believe if it existed someone would find a way to get it, but I think the regulatory climate might be able to prevent the relevant research indefinitely. Improving adult IQ is really hard. This is a bold thing to speculate about! Atmospheric CO2 was 300ish for most of pre-industrial history, 400ish now, and rising. This question predicts 600 in 2100, which sounds like what happens if global warming gets a bit worse but eventually stabilizes. I’m less sure. I think if we make it to 2100, we’ll have so much technology that atmospheric CO2 can be whatever we want it to be. But maybe we’ll want it to stay where it is; once there’s been a lot of global warming and people have moved / shifted lifestyles, it could be equally disruptive to cool the planet back down. Right now it’s 5%, the official government prediction is 10% by 2030, but this market says 17.6%. But look at that probability distribution! It’s a lot of people saying 10%ish, plus a very long tail of very big numbers. I think people are disagreeing about how exponential this change is going to be. 
Shorts Metaculus is holding an essay contest for people who want to use their AI-related prediction markets to argue the future of AI. $6500 available in prizes.
February 22, 2022 · Original source
(I will say that big institutions have been less risk-averse than I worried - I hear Google stopped requiring masks in the Bay Area today, and Polymarket says 79% chance of no mask mandate on domestic flights by November - so maybe this won’t come to pass after all)
March 01, 2022 · Original source
— Will Zelensky still be President of Ukraine on 4/22/22? 42% chance Polymarket seems hesitant to go into actual war predictions, but this market at least acts as a proxy for whether there will be a Ukraine on 4/22/22 - though with a side of “will Zelensky be killed or captured?”. “Yes” dropped as low as 12% during the early parts of the invasion, but is doing a little better now.
March 14, 2022 · Original source
Will Zelenskyy no longer be President of Ukraine on 4/22?: 63% → 20%
But Polymarket has the same problem:
March 21, 2022 · Original source
Will Zelenskyy no longer be President of Ukraine on 4/22?: 20% → 15%
Now that there are two good-sized real-money prediction markets, we can compare them. For example, the first question, on Putin, is at 79.5%, which is reassuringly close to the same question on Polymarket, at 76%.
In fact, I think a coordinated yearly question set to use as a benchmark could be really good for this space. Right now there’s no easy way to compare eg Metaculus to Polymarket because they both use really different questions. I’m hoping to get people together next year, come up with a standard question set, and give it to as many platforms (and individuals!) as possible to see what happens.
May 10, 2022 · Original source
The red line marks the Supreme Court leak. After a month of near-stability, Democrats’ chances went from 22% to 29%, before stabilizing around 26%. Markets on the Senate and on other sites like Polymarket tell a similar story. This is as far as we can go without using Manifold. Manifold questions have much less volume than PredictIt or Metaculus, and I have much less confidence in them, but for the record, here are a few: Disclaimer: I moved that one a bit myself, it was around 77% and I thought that was too high. Despite the fearmongering, this one looks about right to me. Disclaimer that Manifold probably can’t handle probabilities this small correctly and there’s no reason to think 0.2% is more realistic than 2%. It’s not 10% though. I couldn’t find some markets I wanted, so I’ve created them on Manifold for you to bet on: Will the Supreme Court leaker’s identity be known by 2023?
First the proof. Suppose that there was some specific expert who consistently outperformed prediction markets. For example, suppose Nate Silver was on average better than Polymarket. After this had been happening for a while, you would catch on. And then whenever Polymarket and Nate disagreed, you could bet money on Nate’s position on Polymarket and win. The exact amount you could make would depend on how much money was on the relevant Polymarket question and how strongly Nate and Polymarket disagreed, but as Polymarket gets bigger the limit tends toward infinity.
(right now, Nate Silver is better than PredictIt. I noticed this last election and made a few thousand dollars. If PredictIt had been as big then as Polymarket was now, either I would have made a few hundred thousand dollars, or - much more likely - someone else would have been incentivized to beat me to the trade, and the market would be as accurate as Nate by the time I looked at it).
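The argument in the passage above (bet on the expert's position whenever the expert and the market disagree) can be illustrated with a small expected-value calculation. This is a minimal sketch under loud assumptions: the expert's probability is treated as the true probability, shares pay $1 if they win, and fees, slippage, and the price impact of the bet itself are ignored. The function name and all example numbers are hypothetical and do not come from the archive.

```python
def expected_profit(market_prob: float, expert_prob: float, stake: float) -> float:
    """Expected profit from staking `stake` dollars on the expert's side.

    Assumes a binary market where a YES share costs `market_prob` dollars,
    a NO share costs `1 - market_prob` dollars, and a winning share pays $1.
    The expert's probability is treated as the true probability (assumption).
    """
    if not 0 < market_prob < 1:
        raise ValueError("market_prob must be strictly between 0 and 1")
    if not 0 <= expert_prob <= 1:
        raise ValueError("expert_prob must be between 0 and 1")

    if expert_prob >= market_prob:
        # Expert thinks YES is underpriced: buy YES shares.
        price, win_prob = market_prob, expert_prob
    else:
        # Expert thinks YES is overpriced: buy NO shares.
        price, win_prob = 1 - market_prob, 1 - expert_prob

    shares = stake / price
    return shares * win_prob - stake


# Hypothetical example: market at 60%, expert at 70%, $100 staked.
print(round(expected_profit(0.60, 0.70, 100), 2))
# Same disagreement, ten times the stake: profit scales with available money.
print(round(expected_profit(0.60, 0.70, 1000), 2))
```

In this simplified model the expected profit grows with both the size of the disagreement and the amount that can be staked, which is the passage's point about larger markets; a real market would move its price as the bet is placed, so the linear scaling is an idealization.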
June 13, 2022 · Original source
3: Here’s presidential nominees on PredictIt ($13,000,000 in liquidity), Polymarket ($30,000), and Manifold ($M3170):
PredictIt looks good, Manifold looks okay, Polymarket seems to have a long tail of implausible vanity candidates stuck around the 10% level.
July 12, 2022 · Original source
The most useful market I can find here is Polymarket’s question on Twitter getting removed from NYSE (a natural consequence of Musk taking it private):
I don’t know how to square this with Polymarket, unless they think that Twitter might get delisted for other, non-Musk-related reasons.
August 16, 2022 · Original source
So earlier last year, when the CFTC unexpectedly moved against the previously-tolerated illegal crypto site Polymarket, people suspected Kalshi’s lobbying. Now that they’re moving against PredictIt too, people suspect the same.
The community consensus so far seems to be to try to avoid Kalshi as long as it can. There are some good real-money prediction markets open to non-Americans: Polymarket, Futuur, Hedgehog, and Insight Prediction, although Americans will find visits prohibited nationally, and I would never recommend violating precepts negligently. You could also try play-money markets like Manifold, or market-adjacent forecasting sites like Metaculus.
From Polymarket and PredictIt:
October 18, 2022 · Original source
Sources: Manifold, CSPI, Metaculus, Polymarket, PredictIt, Insight, GJOpen
The lowest forecaster is higher than the highest pollster! Taking 538 as an example, forecasters range from 5 pp higher (Manifold) to 17 pp higher (PredictIt). Tournaments and real-money markets tend to give higher numbers than play-money sites. I would go with 47% on this one, based on the convergence between GJO, CSPI, and Polymarket.
CFTC vs. PredictIt (and everyone else), Part II
The Commodity Futures Trading Commission is the US agency regulating prediction markets. In August, they told PredictIt (the biggest political prediction market) to shut down, effective in February. Now a motley group of stakeholders are suing the CFTC for a stay of execution. Plaintiffs include: 2 professors using the site as “a source of data for research”
Source: Polymarket
…but they disagree pretty heavily. Given that Polymarket has $75,000 of real money and Manifold has $3,000 of fake money, I’m trusting Polymarket here.
October 30, 2022 · Original source
1: Polymarket, Manifold, and PredictIt now have shiny interfaces for predicting the upcoming US midterm elections. In terms of the Republicans taking the Senate, Polymarket is at 65%, Manifold at 58%, PredictIt at 73%, and 538 at 49%.
November 21, 2022 · Original source
395 traders on this, so one of Manifold’s biggest markets, probably representative. The small print defines a major outage as one that lasts more than an hour. See here for a good explanation of why some people expect Twitter outages.
Polymarket is within 2% of Manifold. Metaculus here has slightly stricter criteria but broadly agrees. 71 traders, still pretty good, but I find it meaningless without a way to distinguish between “everything collapses, Elon sells it for peanuts to scavengers” vs. “Elon saves Twitter, then hands it over to a minion while he moves on to a company building giant death zeppelins”. Oh, here we go. 20 traders, they think Musk will stay in charge. 23 traders. Twitter was profitable in 2018 and 2019, then went back to being net negative in 2020 and 2021 (I don’t know why) . I don’t think it’s been very profitable lately, so it would be a feather in Musk’s cap if he accomplished this. 24 traders. Twitter’s mDAU have consistently gone up in the past. DAU is slightly different and I think more likely to include bots. 26 traders. One thing I like about Manifold is that it lets you choose any point along the gradient from “completely objective” (eg Twitter’s reported DAU count) to “completely subjective” (eg whether the person who made the market thinks something is better or worse). This at least uses a poll as its resolution method. But the poll will be in the comments of this market, which means it will mostly be by people who invested in this market, who’ll have strong incentives to manipulate it. Maybe Manifold should add a polling platform to their service? 815 traders, one of the biggest markets of all time. It’s easy to see the jump where Musk unbanned Trump the other day. Trump has said that he doesn’t need to tweet because he prefers his own Truth Social network. This is a good business decision on his part, but hinges on him having enough impulse control to stick to his plan and avoid tweeting. The market thinks there’s a 25% chance he can do it! Polymarket again within 2% of Manifold. Only 23 traders here, and they’re a lot less optimistic than the Trump traders. FTX! 43 traders, seems like probably. 
I’ve seen a lot of Twitter takes about how rich well-connected people never get in trouble for this kind of thing, but the markets seem less cynical. 251 traders, and by the way amazing job by “mr22” who started this market on October 5. I also appreciate the relatively late end date - there’s another market “. . . by 2024” which is in the 30s, but that’s because people don’t trust the justice system to move quickly, not because they think he’ll be found innocent. There are a series of markets on sentence length which seem to suggest more than a month but less than a year in jail; this doesn’t really make sense to me and I’m going to nervously ignore them. Only 8 traders here, so take with a grain of salt, but this is a great example of the creative ways people are using Manifold. The market resolves not to “yes” or “no” but to the percent of FTX US users’ funds that they eventually get back; you make money if you were closer than other traders. Here they seem to think most people will only be getting about 14 cents on the dollar. There’s another market for FTX.US users which is a little higher at 29. 34 traders. I think this is too high; I bet it was some random third-tier insider, just because there are more of them and they’re under less scrutiny. Moving on to the effects on effective altruism in particular (just assume I have all possible conflicts of interest here): 272 traders, check the detailed resolution criteria. I think the strongest case is something like the one described in this article, about Center for Effective Altruism leaders discussing concerns about Alameda Research in 2018. The article doesn’t give specifics but my guess is they were the same issues Kerry Vaughn describes here (though see the followup comment by an employee who left FTX, casting doubt on Vaughn’s claims). 
That means the market hinges on whether Vaughn’s allegations fit the resolution criteria that “the unethical behavior must have been related to fraudulent investment strategies that involve spending other people's money without their permission”. Vaughn describes “poor capital controls, including a lack of distinction between money owned by investors and money owned by Alameda itself”, which sounds like it’s in that direction but could cover a wide variety of badness levels. My guess is everyone will end up agreeing that disgruntled Alameda employees whisper-networked that some things were bad about the company in 2018, some of the rumors got to CEA leaders, the leaders debated whether this was worse than normal for a tech startup, decided it didn’t rise to a level where they needed to publicly freak out, and moved on. Isaac will have to pay attention to the details as they come out and decide whether or not it qualifies. 45 traders. This seems to confirm that the CEA incident is responsible for most of the probability mass above; many fewer people think the FTX Future Fund (ie the charitable branch of FTX responsible for giving out their money) was in on this. Related: this market only has five traders, but I’m highlighting it anyway in the hopes that it gets more. The most money is on 2022. My guess is that we’ll find that they had terrible accounting practices in 2018-2019 of the sort that could be classified as criminally incompetent in a way that bled into fraud (but the trades went fine so nobody was harmed) and then they ramped it up a lot in 2022 to deal with the crypto crash. I think this market will be harder to resolve than people expect. 47 traders. Everyone is panicking about this possibility, but it looks like it’s not too likely. 10 traders. I’ll take this chance to say: a lot of media is predicting the death of EA, or a major blow to EA, or something in that category. Not going to happen. 
The media isn’t good at understanding people who do things for reasons other than PR. But most EAs really believe. Like, really believe. If every single other effective altruist in the world were completely discredited, I would just shrug and do effective altruism on my own. If they instituted the death penalty for effective altruism, I would do it under cover of night using ZCash. And I’m nowhere near the most committed effective altruist; honestly I’m probably below average. “Saint gets eaten by lions in Colosseum, can early Christianity possibly survive this setback?” Update your model or prepare to be constantly surprised. 6 traders. So, we lost several hundred million dollars of funding in a giant disaster which was also morally outrageous and demoralizing. It happens. But lots of people have already emailed me asking how to send in more money to help fill the gap. Some added something like “it was so depressing that all the FTX money meant my money didn’t make a difference, but now I can help again, and it’s great!” Can these people fill the hole? 32% chance that they can! 10 traders. And if they don’t, we’ll still probably do better than in 2021, before all the FTX money started rolling in. We’ll try harder to hammer in the point about not doing “ends justify the means” reasoning, and do some reorgs and purges to prevent anything like this from happening again, we’ll make a bunch of other changes - some reasonable, some panic-driven - but we’ll go on. If all the far-future stuff collapses, we’ll donate to global health charities. If the global health charities don’t work, we’ll fund GiveWell to sit around and figure out something that does. If GiveWell gets hit by an asteroid, we’ll work on asteroid deflection (actually I think we might already be doing that). If asteroid deflection turns out to be -EV, we’ll switch to shrimp welfare, or give ourselves Zika virus, or any of a million other things. 
You have no idea how committed we are to continuing to do effective altruism regardless of whether or not it’s “popular”. But it will be popular. 45 traders, resolution criteria at the link, notice the dip when the FTX news broke, followed by recovery as people had time to think it over more. Moving on to slightly less serious topics: The snapshot doesn’t show this, but one of the suggestions is Atlas Rugged. 67 traders, interesting to see where forecasters’ priorities lie. This was a big rumor early on, along with “everyone was on meth”, but the on site psychiatrist said it was false during an interview. 13 traders. WHY DO PEOPLE KEEP GOING ON PODCASTS? Midterms! That was two weeks ago? It feels like years! A week before the midterms, I wrote: Polymarket, Manifold, and PredictIt now have shiny interfaces for predicting the upcoming US midterm elections. In terms of the Republicans taking the Senate, Polymarket is at 65%, Manifold at 58%, PredictIt at 73%, and 538 at 49%. Congratulations 538! Mike Saint Antoine (who wrote the review of Viral in the last Book Review Contest) has put some more work into scoring midterm election forecasts. Here are some headline results: Mike writes: The reason I didn’t just do a three-way comparison between PredictIt, FiveThirtyEight, and Manifold Markets is that the Manifold Markets forecasts included fewer questions than the PredictIt and FiveThirtyEight forecasts. So in order to do a fair comparison here, I’ll be comparing the smaller subset of questions for which PredictIt and Manifold Markets both gave a forecast. So it looks like both Manifold and 538 did better than PredictIt, and there’s no clear way to tell which of the former did better. (except I guess you could do this analysis with just the subset of questions Manifold and 538 share, but Mike didn’t and I’m also not going to). 
PredictIt has a pretty consistent Republican bias (it’s a minor epistemic sin to accuse a prediction market of having a predictable bias unless you’ve made money exploiting it; I made $600 this election, so I’ll let myself pass). In years when Republicans do better than expected, it will probably look better than other markets; in years when they do worse, it will look worse. Still, this is a bias, so I think we should take them doing worse this year as a fair reflection of their accuracy, even though next year it could go the other way. My main two takeaways here are: PredictIt isn’t yet good enough that the ideal theorems showing prediction markets should be unbiased and better than everyone else apply to it. The obvious explanation is its $800-per-question cap. Polymarket doesn’t have that cap and it did better, although Mike hasn’t done a formal comparison to 538.
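The Senate numbers quoted a week before the midterms can be scored with a Brier score (the squared error between a forecast probability and the 0/1 outcome; lower is better). The sketch below is an illustration added here, not Mike's analysis. It scores a single question, codes the outcome as 0 on the assumption that Republicans did not take the Senate, and the names `senate_forecasts` and `brier` are hypothetical.

```python
# Probabilities of Republicans taking the Senate, as quoted in the passage above.
senate_forecasts = {
    "Polymarket": 0.65,
    "Manifold": 0.58,
    "PredictIt": 0.73,
    "538": 0.49,
}

# Assumption: outcome coded as 0, since Republicans did not take the Senate.
OUTCOME = 0


def brier(prob: float, outcome: int) -> float:
    """Squared error between a forecast probability and a 0/1 outcome."""
    return (prob - outcome) ** 2


scores = {name: brier(p, OUTCOME) for name, p in senate_forecasts.items()}
ranking = sorted(scores, key=scores.get)  # lowest (best) score first

for name in ranking:
    print(f"{name}: {scores[name]:.4f}")
```

A single question is far too small a sample to rank forecasters; the broader comparison in the passage relies on many questions.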
December 20, 2022 · Original source
People don’t trust the prediction market. If there is a 10% mispricing, but a 15% chance that the prediction market will go bust, steal your money, or wrongly resolve the question against you - then on average you lose 5% by correcting the mispricing! This doesn’t mean it’s impossible to have a truly accurate prediction market. There are two markets right now - Polymarket and Kalshi - where you can easily make $10,000+ by correcting mispricings. Unsurprisingly, I can’t find any big obvious mispricings on these markets! 2.4: Is there any empirical evidence comparing real prediction markets to experts? Yes. As predicted, prediction markets are about as accurate as Nate Silver. See Maxim Lott’s analysis here. More generally, studies usually find that prediction markets beat the average person, various experts, and various other methods like election polling. They are somewhere between equal-to and slightly-worse-than complicated aggregation algorithms, but these complicated aggregation algorithms are rarely used in real life, and I consider them to be “prediction markets lite”, ie part of the same toolbox of forecasting technologies. Hanson (2007) says (I have not individually checked each claim): Remarkably, in every known head-to-head field comparison between speculative markets and other forward-looking institutions, the speculative markets have been at least as accurate. More often than not, they prevail. 
Orange juice futures improve on National Weather Service forecasts (Roll 1984), horse race markets beat horse race experts (Figlewski 1979), Oscar markets beat columnist forecasts (Pennock, Giles, and Nielsen 2001), gas demand markets beat gas demand experts (Spencer 2004), stock markets beat the official NASA panel at identifying the company responsible for the Challenger accident (Maloney and Mulherin 2003), election markets beat national opinion polls (Berg, Nelson, and Rietz 2003), and corporate sales markets beat official corporate forecasts (Chen and Plott 2002). 2.4.1: If prediction markets are only as good as Nate Silver or other experts, who cares? Why don’t we just skip the prediction markets and listen to the experts directly? See the next section, “Why believe prediction markets are canonical?” 2.4.2: What if there are many equally impressive experts who disagree? What do prediction markets do then? See the next section, “Why believe prediction markets are canonical?” 3. Why believe prediction markets are canonical? By canonical, I mean that in ideal cases: all prediction markets speak with a single voice
Operate outside the United States, closed to US citizens. Polymarket fills this niche effectively.
Operate using play-money only. Here Manifold is the leader. You could also think of superforecasting tournaments like Metaculus as a version of this. I claim that the main reason prediction markets haven’t fulfilled their potential and become a major pillar of worldwide decision-making is that none of these solutions are really adequate. For whatever reason, most people interested in prediction markets are American, so Polymarket has a limited userbase. The regulators are pretty harsh, so the companies that strike deals to get exemptions usually have to trade away most of their functionality. Kalshi can only ask a few specific regulator-approved questions; the limits are so harsh that they’re not even allowed to predict elections. Play-money prediction markets like Manifold are a lot of fun, but there’s a limit to how much work people will do to earn play money. I want a world where the people who are best at correcting mispricings in prediction markets can make full-time jobs out of it, and where there are prediction market equivalents of Goldman Sachs where hundreds of brilliant people work together with cut-throat efficiency to find mispricings the moment they appear. Play money won’t get us there. Real money prediction markets tend to have between four- and six-digit (very occasionally seven-digit) volumes on most questions. Play money prediction markets have between one- and four-digit numbers of traders on most questions. Most big prediction markets are usually within 10% of each other and the best outside experts, but not always within 1%. Traditional financial markets are usually within 1% of each other, so I think this is because the prediction markets are still too small to have sub-1% accuracy. I hope that as they grow bigger they can reach this milestone. 7. What can I do to help promote prediction markets? 
If you’re an ordinary person with no special expertise or skills, I think the best thing you can do is create a Manifold Markets account, bet on topics that are interesting to you, and create markets for any interesting topics that don’t have one yet. I think this could be helpful for a few reasons: It’s hard to really understand prediction markets until you’ve played a few yourself.
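The trust argument earlier in this passage (a 10% mispricing against a 15% chance that the market goes bust, steals your money, or misresolves) is simple arithmetic, and a small sketch may make it concrete. The function name and both models are assumptions made for illustration: the "additive" figure reproduces the passage's rough 10% minus 15% reasoning, while the "multiplicative" figure assumes the whole stake is lost when the market fails.

```python
def expected_return(edge: float, p_fail: float) -> dict:
    """Expected return per dollar staked on correcting a mispricing.

    edge   -- expected gain if the market pays out honestly (0.10 = 10%)
    p_fail -- chance the stake is lost to bust, theft, or misresolution
    """
    # Rough version used in the passage: gain minus chance of failure.
    additive = edge - p_fail
    # Assumed stricter model: with probability p_fail you get nothing back,
    # otherwise you get back your stake plus the edge.
    multiplicative = (1 - p_fail) * (1 + edge) - 1
    return {"additive": additive, "multiplicative": multiplicative}


result = expected_return(edge=0.10, p_fail=0.15)
print(f"additive:       {result['additive']:+.3f}")
print(f"multiplicative: {result['multiplicative']:+.3f}")
```

Under either model the expected return is negative, which is the passage's point: a mispricing smaller than the perceived platform risk is not worth correcting.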
December 21, 2022 · Original source
But there’s an even deeper heuristic, something like “try it and see!” Don’t do this with communism, unfriendly AI, or anything else with catastrophic failure modes that people can’t opt out of. But I look at Polymarket and see that it seems to run pretty well despite a dozen theoretical objections, and this makes me trust theory less. We should still debate the theoretical questions, if only for the mental tranquility of feeling like we understand what’s going on. But overall I expect these kinds of objections to be overcome by creating the thing, subjecting it to market pressure, and seeing what solutions people come up with.
April 25, 2023 · Original source
We’ve talked before about LLMs playing chess; they can sort of do it, but they’re not very good yet. The market thinks 34% chance they’ll get much better in the next five years; I think my estimate is lower. Polymarket is dipping its toes into AI forecasting. This particular one is off to a tough start: GPT-4 came out a month or so after this market was launched, but OpenAI hasn’t said how many parameters it has. You can see all open AI questions (currently just three) here. Also on Polymarket:
Manifold is about the same on the same question. Metaculus’s fancy date prediction system lets them be more specific:
August 09, 2023 · Original source
33: Claim: phase transition in Cu2S impurity fully explains superconductor-like properties of supposed “room temperature superconductor” LK-99 (paper, Twitter discussion). Prediction markets on Manifold and Polymarket are down from high-30s% last week to ~10% now.
August 28, 2023 · Original source
NinthCause and SG are Manifold co-founders. Jack, Marcus Abramovich, and Michael Wheatly are Manifold leaderboard record holders. Peter Wildeford is a superforecaster who came near the top in the ACX forecasting contest. Matthew Barnett works in AI forecasting. You all know Eliezer and Zvi. As far as I can tell nobody high up on the YES side is similarly illustrious. But prediction markets are supposed to ensure you don’t have to resort to name-dropping, so how did this go wrong? I was tempted to blame Manifold-specific factors, like the ability to get starting mana instead of putting skin in the game. But real-money markets Polymarket and Kalshi got approximately the same results: Polymarket: https://polymarket.com/event/is-the-room-temp-superconductor-real Kalshi: https://kalshi.com/markets/supercon/roomtemp-superconductor-reported Both reached the 40s to 50s! I think there just wasn’t enough smart money to drown out the people who wanted to bet on an exciting thing being true, or who were unduly influenced by a social media environment optimized to keep their attention by convincing them that an exciting thing was true. I have never claimed prediction markets are always good. All I wrote in the Prediction Market FAQ was that either a prediction market will be good, or you could make lots of free money. In this case, it was the second one. I regret I only made $30. I do hope this situation will improve over time, as over-eager forecasters get burned and dollars flow from dumb money to smarter. [EDIT: I should have included something about Metaculus here, but it’s confusing. I think the most popular Metaculus market was lower because it had stricter resolution criteria (the first replication had to be positive, instead of any replication) but that otherwise Metaculus raw probabilities mirrored everyone else’s. We don’t know how their algorithmically processed probabilities did yet and I’ll report on that information when I get it.] 
Salem/CSPI Tournament Winners The Salem Center and the Center For The Study Of Partisanship And Ideology, two think tanks associated with right-wing intellectual Richard Hanania, sponsored a prediction market tournament last year. Participants got $1000 in play money to bet on selected markets about current events; winners would be interviewed for a well-paying academic sinecure at one of the think tanks. Now the tournament is over. Winners have yet to be announced, but unofficially, everyone knows who they are: First place out of 999 participants is zubbybadger. Zubby is a prediction market veteran who was featured in a Washington Monthly article last year for his great track record in political betting (he’s made > $150,000 on PredictIt). Now he works as a “community manager” for Kalshi (I don’t know what this entails). Second place was Robert from Considerations On Codecrafting. He’s written a detailed reflection on his experience (part one, part two) which is my main source for this section and highly recommended. He describes himself as “having absolutely no experience with prediction markets”. Third place was Johnny Ten-Numbers, about whom I can find no further information. You can see the rest of the top 20 at the very bottom of this post. Reading Robert’s story of his experience, I’m struck by how little of the competition at the top was about predictive accuracy. Everyone in the top 20 was a very accurate predictor (Exactly equally accurate? Hard to tell.) What separated 1st place from 20th, aside from luck, was things like: Ability to move fast - both in responding to news, and in taking the other side of bad bets. Several top performers programmed bots to give them an edge here.
1: We’re probably getting to that point in the cycle when we’re going to have to include these every month, aren’t we? Source: https://polymarket.com/event/who-will-win-the-us-2024-democratic-presidential-nomination Source: https://polymarket.com/event/who-will-win-the-us-2024-republican-presidential-nomination I think Newsom and maybe RFK are overpriced, everything else here seems reasonable.
September 11, 2023 · Original source
2: Manifold Markets wants me to remind you that this is approximately your last chance to sign up for Manifest, their forecasting and prediction market conference in Berkeley, CA. Guests will include Nate Silver, Robin Hanson, Aella, Zvi, and the CEOs of Kalshi, Manifold, and Polymarket. I’m still figuring out if I can make it but I’ll try my best.
October 09, 2023 · Original source
For example, the best-funded project was “subsidize real money prediction markets on Polymarket”. The project worked, it was fine, nothing went wrong. But the judges (including me) had trouble figuring out why this was better than the many other real money prediction markets that happen on Polymarket all the time, and felt like we already had good data on how real-money markets compare to play-money ones. This was something we didn’t really want, but our investors spent $8,000 on it.
October 31, 2023 · Original source
And many more. Finally, no discussion of Manifest would be complete without mentioning these shirts: And speaking of Polymarket, they were present in force and promising great things. I am sworn to secrecy on some of them, but they were pretty public about their plans to eventually let users create real-money markets on topics of their choice, so watch this space. Manifold.love They finally did it and made good on their threats to open up a prediction market dating site, manifold.love: What’s the prediction angle? For any user, you can suggest a match with any other user on the site, and bet on the chance that the match will work (last at least six months): So far it has the normal problem - not enough women - but otherwise seems fully functional and much more user-friendly than most dating sites. I’ll look into this more later, but some brief preliminary thoughts: Is “chance of a six month relationship” conditional on dating at least once, or unconditional? If unconditional, is there any way for it to resolve no?
I can’t find any markets on the Middle East topic I’m actually interested in, which is Israel’s medium-term plan. Will they kill some Hamas leaders, then get out? Install a puppet government? Permanently occupy Gaza like they’re occupying the West Bank? These all seem like bad options, but they’re very different bad options, and I haven’t seen much speculation about which is most likely.
December 05, 2023 · Original source
Source: Polymarket Source: Kalshi Second one is out of order, but these basically agree. Why is Taylor Swift so high? I understand she’s a very famous pop star, but hasn’t she been an equally famous pop star every one of the past ten years?
January 30, 2024 · Original source
…Metaculus and PredictIt are 50-50, Manifold favors Biden, and Polymarket favors Trump. Shouldn’t really be possible, should it?
This is probably the problem Johnson mentioned above. Manifold has lots of young people and rationalists, who probably lean Biden. Polymarket has lots of degenerate gamblers who use VPNs, who probably lean Trump.
I do think you could probably make a lot of real money in expectation buying Biden on Polymarket - if you were a degenerate gambler with a VPN, of course.
February 20, 2024 · Original source
How many residents will live in Prospera, a new special economic zone in Honduras, on Jan 1, 2026? Answer: 600 (80% confidence interval 100-2,000) This seems like a good guess (except that my confidence interval would have included zero because there’s a 20%+ chance that it gets shut down). So overall its forecasts seem pretty impressive. But I was concerned by its reasoning even in some of the questions it got “right”. For example, the Nikki Haley question tried to get a base rate by asking what percent of elections Haley had won before, and found she had won 71% of them - these were mostly elections for South Carolina governor. You can see what the AI is trying to do - but it’s not going to work. Then it got confused and read a lot of news stories about how she’s currently losing the 2024 presidential election, and seemed to think they were about 2028. So either the AI only got a reasonable probability by coincidence, or it was testing many different strategies, throwing out the useless ones, and updating only on the useful ones, in a way that was kind of opaque to the casual reader. Still, if the company says it beats most human forecasters, this doesn’t seem totally impossible based on what I’ve seen. And that would be exciting! An AI that can generate probabilistic forecasts for any question seems like in some way a culmination of the rationalist project. And if you can make something like this work, it doesn’t sound too outlandish that you could apply the same AI to conditional forecasts, or to questions about the past and present (eg whether COVID was a lab leak). I would be most excited if at some point this graduated from its geopolitical focus and was able to answer questions on any topic (eg “what is the chance that Astral Codex Ten gains paid subscribers this year?”), maybe if the questioner gives it links or feeds it some of the appropriate information. 
FutureSearch is run by a team formerly from Metaculus, including former Metaculus CTO (and Google internal prediction market veteran) Dan Schwarz. They’re looking for potential clients and/or investors; if you’re interested, email hello@futuresearch.ai. Vitalik On AI Prediction Markets Vitalik Buterin, Ethereum-founder-turned-cryptocurrency-public-intellectual, has a blog post on The Promise And Challenge Of Crypto + AI Applications. One of them is a prediction market. He writes: Prediction markets have been a holy grail of epistemics technology for a long time; I was excited about using prediction markets as an input for governance ("futarchy") back in 2014, and played around with them extensively in the last election as well as more recently. But so far prediction markets have not taken off too much in practice, and there is a series of commonly given reasons why: the largest participants are often irrational, people with the right knowledge are not willing to take the time and bet unless a lot of money is involved, markets are often thin, etc. One response to this is to point to ongoing UX improvements in Polymarket or other new prediction markets, and hope that they will succeed where previous iterations have failed. After all, the story goes, people are willing to bet tens of billions on sports, so why wouldn't people throw in enough money betting on US elections or LK99 that it starts to make sense for the serious players to start coming in? But this argument must contend with the fact that, well, previous iterations have failed to get to this level of scale (at least compared to their proponents' dreams), and so it seems like you need something new to make prediction markets succeed. And so a different response is to point to one specific feature of prediction market ecosystems that we can expect to see in the 2020s that we did not see in the 2010s: the possibility of ubiquitous participation by AIs. 
AIs are willing to work for less than $1 per hour, and have the knowledge of an encyclopedia - and if that's not enough, they can even be integrated with real-time web search capability. If you make a market, and put up a liquidity subsidy of $50, humans will not care enough to bid, but thousands of AIs will easily swarm all over the question and make the best guess they can. The incentive to do a good job on any one question may be tiny, but the incentive to make an AI that makes good predictions in general may be in the millions. Note that potentially, you don't even need the humans to adjudicate most questions: you can use a multi-round dispute system similar to Augur or Kleros, where AIs would also be the ones participating in earlier rounds. Humans would only need to respond in those few cases where a series of escalations have taken place and large amounts of money have been committed by both sides. This is a powerful primitive, because once a "prediction market" can be made to work on such a microscopic scale, you can reuse the "prediction market" primitive for many other kinds of questions: Is this social media post acceptable under [terms of use]?
The dumbest possible way to do this is to ask GPT-4 to write a summary (“write the summary of a plot for a detective mystery story”), then ask it to convert the summary into a 100-point outline, then convert that into 100 minutes of a 100-minute movie, then ask Sora to generate each one-minute block. This wouldn’t work as written now (I don’t think Sora can do sound, it wouldn’t keep actors and style consistent unless you forced it), but it seems like something that requires incremental improvement rather than a grand breakthrough. Some politics topics courtesy of Polymarket. Look at those amounts of money!
March 12, 2024 · Original source
Are these the data I’ve been trying to get for years - which forecasting platforms beat which others? I don’t think so - Metaculus’ good Brier score only means it performs well on Metaculus’ questions, which might be easier or harder than some other platform’s questions. Can we use the Halawi et al AI as a fixed comparison point, since it’s always the same skill level? I’m not sure - it trained on each of these markets for the style of question that’s in each market, so it might be biased. Still, these numbers are all about where I would expect them to be, except maybe Polymarket, which does better than I would have expected. But the crowd still beats the AI, right? Halawi et al object that humans can forecast only when they feel like it - you can bet on a prediction market question you feel confident on, and avoid one you don’t. When they let their AI forecast only on those questions where it’s most likely to do well (eg those with lots of relevant news articles), it very slightly outperforms the human crowd. As AI gets better, will it naturally beat humans in forecasting? Halawi et al say this won’t be trivial. They find a version of their system based off GPT-3.5 is only very slightly worse than the final version built off GPT-4. This suggests a forecasting AI built off GPT-5 or 6 might get only small improvements. The second team is Tetlock et al. They start from the same place as Halawi - out-of-the-box LLMs aren’t good at forecasting. They’re more scathing about this than Halawi was - they argue that out-of-the-box models do worse than predicting 50% for everything (this was close to true of human forecasters in the ACX tournament). Instead of increasing quality, Tetlock increases quantity. He wants to do wisdom of crowds, where the crowd is a bunch of different LLMs. 
So he gets twelve LLMs - including Bard, GPT, Claude, Mistral, PaLM, LLaMa, some Chinese models I’d never heard of, and a couple of variations on these bases - asks them to predict questions, and averages the results. Remember, you gotta prompt your model with “you are a smart person”, or else it won’t be smart! The results: Next, we compare the LLM crowd performance to that of the human crowd for our second hypothesis, directly putting the two crowd-aggregation mechanisms head-to-head. To do this, we use the same LLM crowd average as before (taking the median LLM prediction on each question and averaging up the Brier scores across questions). We compare this to the average of median human predictions on the same questions. In our preregistered analysis, we fail to find statistically significant differences between the LLM crowd’s mean Brier score of M=0.20 (SD=0.12) and that of the human crowd, M=0.19 (SD=0.19), t(60) = 0.19, p = 0.850. Their study was much smaller than Halawi’s (31 questions vs. 3,672), so I don’t think this result (nonsignificant small difference) should be considered different from Halawi’s (significant small difference). Still, it’s weird, isn’t it? Halawi used a really complicated tower of prompts and APIs and fine-tunings, and Tetlock just got more LLMs, and they both did about the same. I have two questions after reading these results: Did they actually do the same, or is this just a function of the small sample size in Tetlock and the non-head-to-head comparison?
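The aggregation described in the quoted passage (take the median prediction on each question, then average the Brier scores across questions) can be sketched as follows. This is a minimal illustration and not the paper's code: the function names, question IDs, and forecast numbers are all made up, and the 50%-for-everything baseline is included only because the passage uses it as a yardstick.

```python
from statistics import mean, median


def brier(prob: float, outcome: int) -> float:
    """Squared error between a forecast probability and a 0/1 outcome."""
    return (prob - outcome) ** 2


def crowd_mean_brier(forecasts_by_question: dict, outcomes: dict) -> float:
    """Median forecast per question, then the mean Brier score over questions."""
    per_question = []
    for question, probs in forecasts_by_question.items():
        crowd_forecast = median(probs)  # one crowd number per question
        per_question.append(brier(crowd_forecast, outcomes[question]))
    return mean(per_question)


# Hypothetical data: three forecasters (LLMs or humans) on two questions.
forecasts = {
    "q1": [0.6, 0.7, 0.8],
    "q2": [0.2, 0.4, 0.1],
}
outcomes = {"q1": 1, "q2": 0}

crowd_score = crowd_mean_brier(forecasts, outcomes)
# Baseline: predict 50% on everything, which always scores 0.25.
baseline_score = mean(brier(0.5, o) for o in outcomes.values())

print(f"crowd:    {crowd_score:.3f}")
print(f"baseline: {baseline_score:.3f}")
```

Using the median rather than the mean keeps one wildly overconfident forecaster from dragging the crowd number, which is one plausible reason to aggregate this way.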
Crypto is back in the news, with Bitcoin at record highs again. Polymarket is crypto-based, so it shouldn’t be surprising that they have the highest-liquidity and most diverse crypto questions: And Futuur has an Infectious Diseases section I hadn’t seen before:
May 13, 2024 · Original source
Polymarket and Insight already IP ban US users and claim not to be operating in the US, so they shouldn’t be directly affected. But it might alter some case tangentially involving them one way or another.
People changed their minds a little over time, but not in a very consistent way that mattered much in the end. What was the “client feedback”? The report says: Client feedback was provided to the Superforecasters on December 21. The client posed questions to the Superforecasters about their assessments up to that date and asked for their reactions to several studies and articles. In the days following the client engagement, the Superforecasters lowered their confidence in the natural zoonosis hypothesis from 73% to 67%, although zoonosis remained the most likely potential cause in their assessment. But following an active engagement with recent genomic studies and historical base rates of zoonotic spillovers, those numbers began to return to earlier levels. January also saw increased attention to the geopolitical context and transparency issues, particularly related to research activities in Wuhan. Is this bad? I’m imagining a pro-lab-leak client saying “But what about [this list of pro-lab-leak arguments]?” and then the superforecasters read them and adjust. In one sense, it’s good that they got to see more arguments; on the other, it seems like a potential route by which clients could bias the results - probabilities never quite got back to where they were before the feedback, though they got pretty close. The last-minute spike for zoonosis might be the Rootclaim debate results, which were released on 2/18. So maybe the client feedback and the Rootclaim results both slightly affected the numbers, but mostly the superforecasters started out pro-zoonosis and stuck to their guns. Dan Schwarz and the FutureSearch team say that forecasting has a “rationale-shaped hole”.
Despite the report making this sound like a pretty intense process, we don’t get much information about details: In their extensive discussions, Good Judgment’s Superforecasters assessed base rates and historical patterns, existing evidence and scientific analysis, geopolitical context and transparency concerns, trust in intelligence communities, and methodological constraints. 1. Base Rates and Historical Patterns: The Superforecasters frequently referenced base rates, i.e., the history of pandemics emerging from natural zoonosis versus the history of laboratory leaks, to anchor their probabilities. For the former, they discussed how the base rates are changing as the climate warms and as expanding human populations push farther into natural environments that previously saw little human presence. For the latter, they acknowledged that it has only been 12 years since the advent of CRISPR gene-editing tools, and the base rate of lab leaks in the short synthetic biology era is not yet well established. 2. New Evidence and Scientific Analysis: Throughout the period, the Superforecasters adapted their forecasts in light of new scientific evidence, including genomic analyses of SARS-CoV-2 and its relation to bat viruses, and the debate over potential laboratory manipulation. 3. Geopolitical Context and Transparency Concerns: The geopolitical implications of the virus’s origins, particularly in relation to China’s transparency and the involvement of international research institutions, played a significant role in the analysis. Concerns over data veracity, and over the political ramifications of determining that the pandemic’s origins were other than zoonosis, were extensively debated. 4. Trust in Intelligence: Commentary on trust in intelligence communities and discussions about the impact of geopolitical biases on the interpretation of evidence illustrated the complex interplay between science, politics, and human behavior in assessing the pandemic’s origins. 5. 
Methodological Critiques and the Evaluation of Evidence: The Superforecasters engaged in methodological critiques of the evidence base, including the scrutiny of laboratory practices and biocontainment levels [...] In the end, most Superforecasters were in rough agreement on issues like the base rates of zoonotic spillover. Where they most often disagreed was on the interpretation of actions by Chinese officials and whether their actions reflected how an authoritarian government would react in any crisis over which it did not have full control, or whether those actions were indicative of attempts to cover up a biomedical research-related accident that allowed the SARS-CoV-2 virus to enter circulation in China and, ultimately, the entire globe. Probably it would be too much to ask for to get a transcript of all their discussions - then they’d be nervous saying things that might make them look bad to an audience. What would be a good balance between getting more information and not imposing on their time? Forecasting is an unusually legible and easy-to-judge domain. One of the theories of change for forecasting was to use it to identify smart people with good reasoning, then turn them loose on less well-behaved problems. This is one of the first big attempts to do this at scale. How did it work? We can’t tell, because it’s inherently an illegible and hard-to-judge domain. Darn. I don’t know what I expected. Notes From A Local Optimum Austin’s concern - that forecasting has reached a local optimum - is widely shared. We have some good sites: Manifold, Metaculus, Polymarket, GJO, etc - all doing good work. We have good-ish probabilities for a few important questions. Every so often a news source cites them. Sometimes a decision-maker looks at them behind the scenes, maybe. Is this all there is? The FutureSearch team says the next step is to focus on “rationale”. 
We need to use forecasting not just to get a raw probability, but to explain what’s going on and why we think something. Then instead of just convincing policy-makers to trust forecasts, we can tell them why something is true, or inform their discussions even if they’re not willing to blindly trust a number. Is this a betrayal of the forecasting ethos? The original dream was that instead of a bunch of people giving arguments, we could just test who was right. Now we’re going back to the arguments? People have argued forever; what does forecasting add to that? Well, they add the knowledge that the arguments are from people who have been right a lot before and are incentivized to be right again. Still, it’s not a natural fit. Probably it’s relevant here that FutureSearch’s forecasting AI does a really good job of this by default, in a way humans can’t match. Nuno’s yearly forecasting roundup doesn’t have a single thesis, but the first part is a well-supported complaint that most forecasting sites aren’t good business. They either burn VC money, burn EA donations, or converge towards casinos to support themselves. He gives an honorable exception to Cultivate Labs, which sells prediction market software rather than the results themselves. Open Philanthropy (billionaire Dustin Moskovitz’s EA-aligned charitable foundation) has at least given forecasting a vote of confidence, recently choosing to promote it to one of their main donation areas. Still, they got a lot of pushback on the decision, for example SuperDuperForecasting here: This will be a total waste of time and money unless OpenPhil actually pushes the people it funds towards achieving real-world impact. The typical pattern in the past has been to launch yet another forecasting tournament to try to find better forecasts and forecasters. No one cares, we already know how to do this since at least 2012! The unsolved problem is translating the research into real-world impact. 
Does the Forecasting Research Institute have any actual commercial paying clients? What is Metaculus's revenue from actual clients rather than grants? Who are they working with and where is the evidence that they are helping high-stakes decision makers improve their thought processes? Incidentally, I note that forecasting is not actually successful even within EA at changing anything: superforecasters are generally far more relaxed about Xrisk than the median EA, but has this made any kind of difference to how EA spends its money? It seems very unlikely. And Marcus Abramovich here: I'm in the process of writing up my thoughts on forecasting in general and particularly EA's reverence for forecasting but I feel, similar to @Grayden that forecasting is a game that is nearly perfectly designed to distract EAs from useful things. It's a combination of winning, being right when others are wrong and seemingly useful, all wrapped into a fun game. I'd like to see tangible benefits to more broad funding of forecasting that seems to be done in the millions and tens of millions of dollars. I would also be the type of person you would think would be a greater fan of forecasting. I'm the number one forecaster on Manifold and I've made tens of thousands of dollars on Polymarket. But I think we should start to think of forecasting as more of a game that EAs like to play, something like Magic the Gathering that is fun and has some relations to useful things but isn't really useful by itself. Eli Lifland has a long and hard-to-summarize comment here, response from Ozzie Gooen here, podcast between them on “Is Forecasting A Promising EA Cause Area?” here. I’m split on this. My previous hope was that the field would gradually grow, without any qualitative changes or discontinuities, until it became big enough that journalists and policy-makers were aware of it and took it seriously (compare eg the growth of the Internet as a scholarly resource).
I think the strongest argument against this is Manifold’s relatively flat user numbers. Is there a new hope? I think if nothing else, forecasting might be useful as a testing ground: First, to create forecasting AIs (like FutureSearch) which can then get consulted on a variety of questions, eg by policy-makers. The biggest holdup has always been the need to gather 20 or 50 or however many hard-to-find superforecasters for whatever question you’re asking, and then trust their advice even though they’re fallible fleshbag humans. If you can use the 20 to 50 superforecasters to inspire an AI, and then test the AI and prove it’s good, people might be more interested. This is especially true if the AI can branch out beyond traditional forecasting questions. Once we have a few of these, we can start comparing the next generation of AIs to the previous generation, and skip the superforecasters.
Here’s an embarrassing screwup from Metaculus. This question was about when there would be a “Great Power war”, with Great Powers defined as any country in the top ten of military spending. But surprise surprise, Ukraine getting invaded made them spend a lot of money on their military that year, so they rose to #8 in the world in military spending in 2023. Since Russia is also in the top ten, this qualifies as a “Great Power war” by the technical definition, and the question resolves positive. Moral of the story: resolution criteria are hard! Polymarket on 2024 election results. In the past they’ve had a Republican bias, but now their Presidential markets are in tune with everyone else in the polls, so maybe this is accurate.
July 02, 2024 · Original source
I assume they chose these three because they’re the only ones discussed enough to have enough data. I am following their lead. I appreciate John and Maxim’s work, but I’m not completely comfortable trusting it. Their model is based on results from Betfair, Smarkets, PredictIt, and Polymarket. But I don’t know much about the first two (as an American, I’m banned from even reading Betfair), and the latter two are notoriously bad at partisan political questions. They usually overestimate Republicans’ chances, partly because Democrats’ opposition to online political betting has turned the pool of online political bettors disproportionately red. While a fluid and easily-accessible prediction market should be able to avoid biases like these, neither PredictIt nor Polymarket really qualifies. The CFTC, which regulates prediction markets, has crippled both - PredictIt has very low maximum investments per market, and Polymarket is crypto-only and banned for US citizens. These have prevented their biases from being corrected and made both of them perform relatively weakly in head-to-head contests. And Stossel/Lott’s focus on betting sites automatically excludes two of the biggest and most historically accurate forecasting engines from their calculation - Metaculus and Manifold. In order to get numbers I trusted more than theirs, I looked at Metaculus, Manifold, PredictIt, and Polymarket, weighting each by how much I trusted it. Here’s what I found: The Biden number is about 4% higher than Nate Silver’s model over the same time period; see below for why that might be. [EDIT 7/2/24: Original version had a miscalculation which decreased everyone’s odds by about 10%. Above version should be correct.] You can find my sources at the bottom of the post. 
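The trust-weighted combination described above amounts to a weighted average of each platform’s probability. A minimal Python sketch, assuming made-up probabilities and weights (the post does not state the actual weights, and `weighted_average` is a hypothetical helper name):

```python
def weighted_average(estimates, weights):
    """Trust-weighted mean of probability estimates keyed by platform name."""
    total = sum(weights[name] for name in estimates)
    return sum(estimates[name] * weights[name] for name in estimates) / total

# Hypothetical numbers for one candidate; not the values used in the post.
estimates = {
    "Metaculus": 0.30,
    "Manifold": 0.32,
    "PredictIt": 0.36,
    "Polymarket": 0.34,
}
# Assumed weights: more trust in the first two platforms, less in the others.
weights = {
    "Metaculus": 3.0,
    "Manifold": 3.0,
    "PredictIt": 1.0,
    "Polymarket": 1.0,
}

combined = weighted_average(estimates, weights)
```

With these illustrative inputs the combined estimate sits closer to the higher-weighted platforms than a plain average would.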
“Explicit” odds are based on questions like “What are the chances of Biden winning if he is the nominee?” “Implied” odds were generated by combining the questions “What is the chance of Biden being the nominee?” and “What is the chance of Biden winning?”; this is safe enough with Biden, but with unlikely nominees like Newsom, some of the percentages can get small enough that they start running into small-number-biases and become less trustworthy. I’ve weighted each market’s explicit calculation higher than their implicit one to compensate. A possible objection to these results: conditional probabilities don’t exactly reflect the intuitive concept of decision-making. That is, we’re not asking “We want to know whether or not to keep Biden, so what are the chances that he’ll win if we do?”, we’re asking the market for the chance that he’ll win, in the set of worlds where people decide to keep him for other reasons. We should expect this to overestimate his performance. That is, imagine that tomorrow, Biden has completely recovered, he easily wins his next debate with Trump, and everyone agrees the most recent debate was just a fluke - in that world, he is both more likely to be nominated and more likely to win. Alternatively, if tomorrow he gets much worse and can’t even speak in full sentences, he’s much less likely to be nominated and much more likely to lose. Since the real world includes both those possibilities, restricting ourselves to the set of worlds where he gets nominated means we’re overestimating the chance that he wins. There are similar-albeit-less-severe problems with other candidates - if we choose Newsom, that might be because he won some kind of debate or process versus Harris and all the other potential replacements. Overall I expect this to be mostly correct, but probably overestimate Biden’s chances by a percent or two relative to others. 
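The “implied” calculation and its small-number problem can be illustrated with a short Python sketch. It assumes a candidate can only win the election if nominated, so that P(win | nominee) = P(win) / P(nominee). The prices are hypothetical and `implied_conditional` is an illustrative name, not anything from the post.

```python
def implied_conditional(p_win: float, p_nominee: float) -> float:
    """P(win | nominee), assuming the candidate can only win if nominated."""
    if not 0 < p_nominee <= 1:
        raise ValueError("p_nominee must be in (0, 1]")
    # Noisy prices can push the ratio above 1, so cap it at certainty.
    return min(p_win / p_nominee, 1.0)

# Hypothetical market prices, chosen only to illustrate the arithmetic.
likely = implied_conditional(p_win=0.28, p_nominee=0.70)     # about 0.40
unlikely = implied_conditional(p_win=0.015, p_nominee=0.03)  # about 0.50

# A one-point pricing error in p_win barely moves the likely nominee...
likely_shift = implied_conditional(0.29, 0.70) - likely
# ...but swings the unlikely nominee's implied odds by tens of points.
unlikely_shift = implied_conditional(0.025, 0.03) - unlikely
```

The same one-point error moves the likely nominee’s implied odds by under two points and the unlikely nominee’s by over thirty, which is one reason to weight explicit markets more heavily for long-shot candidates.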
Along with these three candidates, Metaculus had an explicit “should the Democrats replace Biden?” question: Manifold also asks how Democrats will do if they replace Biden (without specifying a particular replacement): We can compare this to their Biden market… …and find that once again, they expect replacing Biden to go better (though I think 51% is just cope). At the Manifest prediction market conference in early June, I interviewed Nate Silver: …and asked him for his probability that the Democrats would win this election, versus his probability that the Democrats would win conditional on Biden not being the nominee (specifically “drops dead tomorrow of natural causes”). He said 40-45% chance normally, 50% chance without Biden. This was before the debate, but I think it matches the markets’ opinion that switching candidates would help the Democrats’ chances - and this has only become more true since the debate. On the other hand, polls asking people how they would vote in possible matchups don’t show any advantage of alternate candidates over Biden. Here’s the only post-debate poll I could find: And if Biden does need to be replaced, Democrats mostly support Harris, who the prediction markets find least promising: Maybe Democrats are the wrong people to ask - they’re already going to vote Biden, so you want someone who’s more attractive to independents. Of course, in a normal primary it would be Democrats making the decision. But if elites are going to do something behind closed doors, maybe they should take advantage and choose the candidate most likely to win, for once. I think these polls are the strongest objection to the prediction markets’ verdict. You could make an argument where prediction market users are mostly educated liberal white males, and even though they’re incentivized to honestly determine what ordinary people think, they’re too out-of-touch with ordinary people to do so effectively. 
Or they might be over-fixating on “voters don’t like Biden’s senility” without considering that, even if voters didn’t know Biden was currently senile before Thursday, they probably guessed that he would become senile sometime in his four-year term, and had basically accepted that his aides would do the hard work. Maybe they prefer a well-known likeable incumbent over an unknown quantity (and the unknown quantity’s potential new/weird aides), even if the well-known likeable incumbent is senile. Maybe elites know more than we do about how hard it is to inject a new candidate at the last moment, how dangerous it is to have someone who hasn’t been thoroughly vetted for scandals, et cetera. Still, for now I trust the prediction markets. I think replacing Biden would add ~10 percentage points to the Democrats’ chance of victory. At the end of this post, I’ll list the prediction markets I’m using as sources. But before then, a brief interlude of: Fuzzy Subjective Human Factors I Am Not Really Qualified To Talk About Many people on Twitter are asking “how could anyone possibly have been stupid enough to not realize that Biden was senile?” I was that stupid. I didn’t say it openly, because I’m at least smart enough to have a high threshold for giving my opinion on political things I don’t know much about. But I thought it in my heart. So in case the people asking “how could anyone have been that stupid?” actually want an explanation, here’s my former reasoning. Republicans have been accusing Biden of being senile (and the Democrats of hiding it) for at least five years now. Before the 2020 debates, they were excited that this was when they could finally prove once and for all that Biden was senile. Then Biden did fine, and they retreated to “well he’s senile but they have some secret drug they’re giving him, just during debates, that makes him look fine”.
Notice this is from 2020; according to polls, he did win the debate that year (source) I think a lot about experimental cognitive enhancement drugs, and I can say with confidence that nothing like that exists. Stimulants can help people with mild dementia be more active and motivated, but they don’t really improve cognition directly, and they can’t make a demented person temporarily lucid. Still, for the past four years, every time Biden was going to do something - a press conference, a State of the Union, whatever - the Republicans would say “ha, this time is going to be the proof that he’s senile!” And then he would always do fine, and they would retreat back to “I guess he used the secret drug this time too”. The satire site Babylon Bee had some funny articles about this: Babylon Bee, after Biden gave a good State of the Union speech earlier this year. Meanwhile, the Democrats were spreading the alternate narrative that Trump was senile. This one has gotten less press, because I don’t know how many people really believed it. But it came up occasionally, along with out-of-context video snippets where Trump said or did something dumb or meandering. Of course, anybody with a presidential candidate’s level of public exposure will have a few gaffes. Even if they don’t, you can always deceptively crop something so it looks like they did. Wait, why is a psychoanalyst getting quoted as a top expert in dementia? (source) I didn’t know you could diagnose someone via Change.org petition, but 2544 people who claim to be licensed professionals can’t be wrong! So with the constant attempts to prove that both candidates were senile, the constant demonstration by both candidates that they weren’t, and the constant retreat into conspiracy theories of “I guess he used the magic drug again but we’ll get him next time!”, I just tuned out this entire category of thing. And I guess I kept it tuned out longer than I should have, whoops. Reversed stupidity is not intelligence. 
Even if liars are saying something for their usual liar reasons, it can still be true. For twenty years, people spread false rumors that Castro was on his deathbed, but this didn’t make Castro immortal. In the same way, I should have figured out that even if I couldn’t trust any particular claim that Biden was senile, the prior for an 81 year old becoming senile was still high. But I guess I assumed that if he was becoming senile, some Democratic elites would have secret knowledge about it, and they couldn’t possibly be so stupid as to deny it while also scheduling him for a debate where it would inevitably come out. So I figured the Democratic elites who were closest to him thought he was doing well, and I trusted them more than the people who had been wrong every time for the past five years. I’m still confused what those elites were thinking. Reading the news coverage for the past few days (including some video clips from a post-debate rally where he seemed noticeably better) it seems like some combination of: He has good days and bad days, and they were hoping this would be a good day.
September 17, 2024 · Original source
…okay, that wasn’t fun or interesting either. Also, it’s really hard (there are a lot more new posts than old ones). But I bet it’ll be fun to try the same thing a year or so after the election. Polymarket Is Rolling In Cash We talk about a lot of topics here. AI forecasters. Brier scores. Fixing science. But the average person is in forecasting for one thing: betting on presidential elections.
Here’s Polymarket’s volume (in dollars bet) over time (source):
Even a 1% fee on all this trading would make Polymarket a lot of money. But they . . . don’t really seem to charge fees? According to Forbes (paywalled):
November 05, 2024 · Original source
Polymarket’s Wild Ride On October 14th, Polymarket gave Donald Trump 54% odds of winning, compared to Nate Silver’s 49% and Metaculus’ 45%. Whatever, everyone knows Polymarket has a small right-wing bias, and 5% isn’t too bad. Three days later, it had risen from 54% to 61%, despite no news and no change for Metaculus or Nate, bringing the Polymarket/Silver spread to an unprecedented 11%. What happened? This is the rare prediction market story where the answers are already in the New York Times and the Wall Street Journal: one really rich guy put $30 million on Trump (a recent followup by Jorge Velez claims it’s actually more like $75 million). Although he prefers to remain anonymous, reporters have talked to him and are able to reveal that he’s French, goes by “Theo”, is a former banker, and has no insider connections. He’s just a normal rich guy who really thinks Trump will win. This is exactly the sort of shock that prediction markets are supposed to be resilient against. Instead, the market stayed at 61% for days, swung even higher for a while, finally fell back down two weeks later, then went back up again. What happened? The simplest story would be insufficient liquidity: there just weren’t enough people to gather the $75 million it would take to bet against Theo. This is superficially plausible: Polymarket requires crypto and bans Americans, so the mispricing couldn’t be corrected until enough crypto-literate, American-election-following foreigners showed up to bet $75 million. That’s a tall order, and maybe it took two weeks. But the simple story seems wrong. Other real-money markets rose approximately in tandem with Polymarket. For example, Smarkets got to Trump 59% on 10/16, and peaked at 64% on 10/30. Kalshi followed a similar path. Both tracked Polymarket, not Nate Silver or Metaculus (neither of whom ever went above Trump 55% since Harris joined the race). So I think the remaining stories are: Theo made his giant bet on Polymarket. 
By coincidence, at the same time, bettors everywhere massively overcounted a few good polls for Trump and started a feeding frenzy on pro-Trump shares. This made all other markets gain, and Polymarket stay at its Theo-caused peak, until a few bad polls for Trump brought everyone back to reality last week.
November 07, 2024 · Original source
Polymarket (and prediction markets in general) had an amazing Election Night. They called states impressively early and accurately, kept the site stable through what must have been incredible strain, and have successfully gotten prediction markets in front of the world (including the Trump campaign). From here it’s a flywheel; victory building on victory. Enough people heard of them this election that they’ll never lack for customers. And maybe Trump’s CFTC will be kinder than Biden’s and relax some of the constraints they’re operating under. They’ve realized the long-time rationalist dream of a widely-used prediction market with high volume, deserve more praise than I can give them here, and I couldn’t be happier with their progress.
This is equivalent to the implicit argument between Polymarket and a group of other forecasting sites, especially Metaculus.
Just before the election, Polymarket and other real-money prediction markets said Trump had a 60% chance of winning. Metaculus and other non-money forecasting sites said he had a 50% chance of winning.
November 11, 2024 · Original source
2: Comments on things related to last week’s Polymarket post: Michael Wiebe argues that Theo the French whale’s contrarian polling take was based on a simple misunderstanding of how to read polls (I stand by my claim that a single success demonstrates almost nothing about the trustworthiness of the underlying process). And Alex Tabarrok says more about why this was a big victory for prediction markets.
January 20, 2025 · Original source
This year I had hoped to arrange some kind of fair comparison with Polymarket so I could prove my thesis that it usually underperforms Metaculus - but with all the excitement of the election and the feds harassing Shayne we never got around to making it work.
October 13, 2025 · Original source
Charlie Molthrop, $5K, for “normie-friendly prediction market interfaces”. Charlie has already made some tools for visualizing Manifold and Polymarket results; for example, a bot that tweets sudden dramatic changes on important Manifold questions.
January 13, 2026 · Original source
For a few weeks in October, Polymarket founder Shayne Coplan was the world’s youngest self-made billionaire (now it’s some AI people). Kalshi is so accurate that it’s getting called a national security threat.
The catch is, of course, that it’s mostly degenerate gambling, especially sports betting. Kalshi is 81% sports by monthly volume. Polymarket does better - only 37% - but some of the remainder is things like this $686,000 market on how often Elon Musk will tweet this week - currently dominated by the “140 - 164 times” category.
(ironically, this seems to be a regulatory difference - US regulators don’t mind sports betting, but look unfavorably on potentially “insensitive” markets like bets about wars. Polymarket has historically been offshore, and so able to concentrate on geopolitics; Kalshi has been in the US, and so stuck mostly to sports. But Polymarket is in the process of moving onshore; I don’t know if this will affect their ability to offer geopolitical markets)
Degenerate gambling is bad. Insofar as prediction markets have acted as a Trojan Horse to enable it, this is bad. Insofar as my advocacy helped make this possible, I am bad. I can only plead that it didn’t really seem plausible, back in 2021, that a presidential administration would keep all normal restrictions on sports gambling but also let prediction markets do it as much as they wanted. If only there had been some kind of decentralized forecasting tool that could have given me a canonical probability on this outcome!
Still, it might seem that, whatever the degenerate gamblers are doing, we at least have some interesting data. There are now strong, minimally-regulated, high-volume prediction markets on important global events. In this column, I previously claimed this would revolutionize society. Has it?
I don’t feel revolutionized. Why not?
The problem isn’t that the prediction markets are bad. There’s been a lot of noise about insider trading and disputed resolutions. But insider trading should only increase accuracy - it’s bad for traders, but good for information-seekers - and my impression is that the disputed resolutions were handled as well as possible. When I say I don’t feel revolutionized, it’s not because I don’t believe it when it says there’s a 20% chance Khamenei will be out before the end of the month. The several thousand people who have invested $6 million in that question have probably converged upon the most accurate probability possible with existing knowledge, just the way prediction markets should.
I actually like this. Everyone is talking about the protests in Iran, and it’s hard to gauge their importance, and knowing that there’s a 20% chance Khamenei is removed by February really does help to place them in context.
The missing link seems to be between “it’s now possible to place global events in probabilistic context → society revolutionized”. Here are some possibilities:
Maybe people just haven’t caught on yet? Most news sources still don’t cite prediction markets, even when many people would care about their outcome. For example, the Khamenei market hasn’t gotten mentioned in articles about the Iran protests, even though “will these protests succeed in toppling the regime?” is the obvious first question any reader would ask.
Maybe the problem is that probabilities don’t matter? Maybe there’s some State Department official who would change plans slightly over a 20% vs. 40% chance of Khamenei departure, or an Iranian official for whom that would mean the difference between loyalty and defection, and these people are benefiting slightly, but not enough that society feels revolutionized.
Maybe society has been low-key revolutionized and we haven’t noticed? Very optimistically, maybe there aren’t as many “obviously the protests will work, only a defeatist doomer traitor would say they have any chance of failing!” “no, obviously the protests will fail, you’re a neoliberal shill if you think they could work” takes as there used to be. Maybe everyone has converged to a unified assessment of probabilistic knowledge, and we’re all better off as a result.
Maybe Polymarket and Kalshi don’t have the right questions. Ask yourself: what are the big future-prediction questions that important disagreements pivot around? When I try this exercise, I get things like: Will the AI bubble pop? Will scaling get us all the way to AGI? Will AI be misaligned?
March 03, 2026 · Original source
A coarser yes-no Polymarket tells the same story:
The chance of Anthropic getting a $500 billion+ valuation in 2026 fell from 90% to 76%, before rebounding to 83%.