The Turing test is a recurring concept in the Astral Codex Ten archive, appearing 7 times across 7 issues between April 14, 2022 and November 20, 2024. The archive places it in contexts such as "will plant-based meat pass a “Turing test”"; "Imagine that a computer program passed this test for the first time"; "When the program ELIZA passed the Turing test". It most often appears alongside Sam Altman, Alan Turing, and ChatGPT.
- Article page: Turing test
- Mention count: 7
- Issue count: 7
- First seen: April 14, 2022
- Last seen: November 20, 2024
17: Metaculus: will plant-based meat pass a “Turing test” (where people can’t distinguish it from real meat) by 2023? Currently at 55%
We know this because it happened several times. The first time was in 1966, when ELIZA passed the Turing test. ELIZA was a chatbot that could fool some people into believing that they were talking with a real human. Before ELIZA, people assumed that only an intelligent machine could do that, but it turned out that it is really easy to fool people. Other tests for intelligence were playing chess, playing a whole variety of games, or recognizing cat images. Machines can do all of this by now, and this is awesome. And yet, every success sparked new disappointment, because we didn't find any magic ingredient, some quality that would make a difference between intelligent and non-intelligent. When the groundbreaking GPT-3 and DALL-E suddenly could write news articles or poetry, or could dream up snails made of harp... the main improvement was that they used more raw computation power than the previous versions.
Do you feel disappointed by the book? At least some people did. When the program ELIZA passed the Turing test, a common reaction was “This is not what we had meant”. Some reactions to this book were similar. Dehaene’s concept of consciousness is much more mundane than the lofty associations that we commonly attach to the word consciousness. But if we actually want to nail it down and get a more substantial definition than “whatever elevates me above mere animal”, then “the thing that happens during conscious perceptions and does not happen during unconscious perceptions” sounds pretty convincing to me.
Imagine that there was a generally acknowledged test for artificial intelligence, to find out whether a computer program is truly intelligent. And imagine that a computer program passed this test for the first time. How would you feel about it?
The year is 2028, and this is Turing Test!, the game show that separates man from machine! Our star tonight is Dr. Andrea Mann, a generative linguist at University of California, Berkeley. She’ll face five hidden contestants, code-named Earth, Water, Air, Fire, and Spirit. One will be a human telling the truth about their humanity. One will be a human pretending to be an AI. One will be an AI telling the truth about their artificiality. One will be an AI pretending to be human. And one will be a total wild card. Dr. Mann, you have one hour, starting now.
AIR: The Turing Test! rules forbid asking contestants to write poetry. AIs can write poems in seconds, but humans can’t. It would make the game too easy.
MANN: And be forever known as the person who won Turing Test! with racial slurs? I had a spiritual experience once. It was on two hundred micrograms of acid. I still think there was something meaningful about it. I know AIs have been trained on every spiritual experience every human has ever written about online, but it still feels like the essence of a spiritual experience is something that can’t be put into words. It’s not like you’re leaving me many other options. I think maybe having an experience that can’t be put into words, and putting it into words, is subtly different from reading words about an experience that can’t be put into words, guessing what they’re pointing at, and then writing words about your guess. That’s the best way I can think of to defeat a language model.
A recent leak suggested that the cost of training GPT-4 was $63 million, which is already higher than the superforecasters’ median estimate of $35 million by 2024, so that estimate has already been proven incorrect. I don’t know how many petaFLOP-days were involved in GPT-4, but maybe that one is already off too. There was another question on when an AI would pass a Turing Test. The superforecasters guessed 2060, the domain experts 2045. GPT-4 hasn’t quite passed the exact Turing Test described in the study, but it seems very close, so much so that we seem on track to pass it by the 2030s. Once again the experts look better than the superforecasters.
So is it possible that we, in 2023, now have so much better insight into AI than the 2022 forecasters that we can throw out their results? We could investigate this by looking at Metaculus, a forecasting site that’s probably comparably advanced to this tournament. They have a question suspiciously similar to XPT’s global catastrophe framing. In summer 2022, the Metaculus estimate was 30%, compared to the XPT superforecasters’ 9% (why the difference? maybe because Metaculus is especially popular with x-risk-pilled rationalists). Since then it’s gone up to 38%. Over the same period, Metaculus estimates of AI catastrophe risk went from 6% to 15%. If the XPT superforecasters’ probabilities rose linearly by the same factor as Metaculus forecasters’, they might be willing to update total global catastrophe risk to 11% and AI catastrophe risk to 5%.
But the main thing we’ve updated on since 2022 is that AI might be sooner. But most people in the tournament already agreed we would get AGI by 2100. The main disagreement was over whether it would cause a catastrophe once we got it. You could argue that getting it sooner increases that risk, since we’ll have less time to work on alignment. But I would be surprised if the kind of people saying the risk of AI extinction is 0.4% are thinking about arguments like that.
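The linear rescaling described above can be checked with a few lines of arithmetic. This is only a sketch: the function name `rescale` is hypothetical, the Metaculus and XPT figures are the ones quoted in the passage, and the superforecasters’ 2022 AI catastrophe baseline is not given in this excerpt, so the code only back-solves the baseline that the quoted 5% figure would imply.

```python
def rescale(baseline_pct, ref_before_pct, ref_after_pct):
    """Scale a baseline by the same factor a reference forecast moved by.

    Hypothetical helper for illustration; all values are percentages.
    """
    factor = ref_after_pct / ref_before_pct
    return baseline_pct * factor


# Figures quoted in the passage (percent).
METACULUS_GLOBAL = (30.0, 38.0)  # summer 2022 -> later estimate
METACULUS_AI = (6.0, 15.0)       # same period, AI catastrophe
XPT_GLOBAL_2022 = 9.0            # XPT superforecasters, global catastrophe

# Global catastrophe: 9% scaled by 38/30.
xpt_global_updated = rescale(XPT_GLOBAL_2022, *METACULUS_GLOBAL)

# AI catastrophe: the excerpt gives only the updated 5% figure, so we
# compute the Metaculus factor and the baseline that 5% would imply.
ai_factor = METACULUS_AI[1] / METACULUS_AI[0]
implied_ai_baseline = 5.0 / ai_factor

print(f"Updated global catastrophe risk: {xpt_global_updated:.1f}%")
print(f"Metaculus AI catastrophe factor: {ai_factor:.2f}x")
print(f"Implied 2022 AI catastrophe baseline: {implied_ai_baseline:.1f}%")
```

Under these assumptions the global figure comes out a little above 11%, which matches the rounded number in the passage, and the 5% AI figure corresponds to a starting point of roughly 2%.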
So maybe we shouldn’t expect much change. FRI called back a few XPT forecasters in May 2023 to see if any of them wanted to change their minds, but they mostly didn’t. Overall I don’t think this was just a problem of the incentives being bad or the forecasters being stupid. This is a real, strong disagreement. We may be able to slightly increase their forecast based on recent events, but this would only change the estimate a little.
Breaking Down The AI Estimate
How did the forecasters arrive at their AI estimate? What were the cruxes between the people who thought AI was very dangerous, and the people who thought it wasn’t? You can think of AI extinction as happening in a series of steps: We get human-level AI by 2100.
Back in 1950, Alan Turing believed that an AI would surely be intelligent (“can a machine think?”) if it could appear human in conversation. Nobody has subjected modern LLMs to a full Turing Test, but nothing hinges on whether they do. LLMs either blew past the Turing Test without fanfare a year or two ago, or will do so without fanfare a year or two from now; either way, no one will care. Instead of admitting AI is truly intelligent, we’ll just admit that the Turing Test was wrong.
(and “a year or two from now” is being generous - a dumb chatbot passed a supposedly-official-albeit-stupid Turing Test attempt in 2014, and ELIZA was already fooling people in 1964.)
But as I’ve actually used some of the various technologies lumped together as “artificial intelligence,” over and over my reaction has been: “Jesus, this stuff is actually very powerful… and this is only the beginning.” I think many of my fellow leftists tend to have a dismissive attitude toward AI’s capabilities, delighting in its failures (ChatGPT’s basic math errors and “hallucinations,” the ugliness of much AI-generated “art,” badly made hands from image generators, etc.). There is even a certain desire for AI to be bad at what it does, because nobody likes to think that so much of what we do on a day-to-day basis is capable of being automated. But if we are being honest, the kinds of technological breakthroughs we are seeing are shocking. If I’m training to debate someone, I can ask ChatGPT to play the role of my opponent, and it will deliver a virtually flawless performance. I remember not too many years ago when chatbots were so laughably inept that it was easy to believe one would never be able to pass a Turing Test. Now, ChatGPT not only aces the test but is better at being “human” than most humans. And, again, this is only the start.
Alan Turing recommended that if 30% of humans couldn’t tell an AI from a human, the AI could be considered to have “passed” the Turing Test. By these standards, AI artists pass the test with room to spare; on average, 40% of humans mistook each AI picture for human.
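The comparison above is a simple threshold check. As a minimal sketch, assuming the 30% criterion exactly as the passage states it (the function name `passes_threshold` and the constant names are hypothetical):

```python
def passes_threshold(fooled_pct, threshold_pct=30.0):
    """Return True if the share of judges fooled meets the threshold.

    Uses the 30% criterion as quoted in the passage; values are percentages.
    """
    return fooled_pct >= threshold_pct


AI_ART_FOOLED_PCT = 40.0  # average share mistaking each AI picture for human
THRESHOLD_PCT = 30.0      # criterion quoted in the passage

result = passes_threshold(AI_ART_FOOLED_PCT, THRESHOLD_PCT)
margin_points = AI_ART_FOOLED_PCT - THRESHOLD_PCT

print(f"Passes: {result}, margin: {margin_points:.0f} percentage points")
```

The "room to spare" in the passage is the 10 percentage point gap between the two figures.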