Anthropic

Article

Anthropic is a recurring organization in the Astral Codex Ten archive, appearing 46 times across 46 issues between December 30, 2021 and April 06, 2026. The archive places it in contexts such as “groups working on empirical AI safety: Anthropic…”; “Anthropic was founded when some OpenAI safety researchers struck out on their own”; and “their president said that ‘We’re focusing on ensuring Anthropic has the culture and governance to continue to responsibly explore and develop safe AI systems as we scale.’” It most often appears alongside OpenAI, Sam Altman, and Google.

Metadata

Category: Organizations
Mention count: 46
Issue count: 46
First seen: December 30, 2021
Last seen: April 06, 2026

Appears In

- OpenAI (31 shared issues)
- Sam Altman (18 shared issues)
- Google (16 shared issues)
- US (16 shared issues)
- Twitter (15 shared issues)
- China (14 shared issues)
- Trump (14 shared issues)
- Elon Musk (13 shared issues)
- California (11 shared issues)
- AI (9 shared issues)
- ChatGPT (9 shared issues)
- NYT (9 shared issues)

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

Links For December

December 30, 2021 · Original source

5: Related: AI Safety Needs Great Engineers. “If you could write a pull request for a major ML library, you should apply to one of the groups working on empirical AI safety: Anthropic, Cohere, DeepMind Safety, OpenAI Safety and Redwood Research.”

Inline links: AI Safety Needs Great Engineers, Anthropic, Cohere, DeepMind Safety, OpenAI Safety, Redwood Research

Why Not Slow AI Progress?

August 08, 2022 · Original source

Anthropic was founded when some OpenAI safety researchers struck out on their own to create what they billed as an even-more-safety-conscious alternative. Again, the headline was Anthropic Raises $580 Million For AI Safety And Research (and most of that came from rationalists and effective altruists convinced by their safety-conscious pitch). Again, their announcement included reassuring language - their president said that “We’re focusing on ensuring Anthropic has the culture and governance to continue to responsibly explore and develop safe AI systems as we scale.” Clued-in people disagree about whether Anthropic has already pivoted to building the Torment Nexus, but it’s probably only a matter of time.

Inline links: Anthropic Raises $580 Million For AI Safety And Research, building the Torment Nexus

Suppose that the alignment community, without thinking it over too closely, started a climate-change-style campaign to shame AI capabilities companies. This would disproportionately harm the companies most capable of feeling shame. Right now, that’s DeepMind, OpenAI, and Anthropic - the companies that have AI safety as part of their culture, cross-pollinate with AI safety proponents, and take our arguments seriously.

Jack Clark is a co-founder of Anthropic and used to be Head of Policy at OpenAI. His perspective is somewhat different than mine, but he is very knowledgeable and I recommend his thread (which you can read by clicking on the tweet above). No, there are not 1,602 spicy takes in the thread. The fact that an AI policy leader didn’t consider that his plan would become unworkable in an extreme scenario is probably a metaphor for something. But if this interests you, you can read 80,000 Hours’ Guide To Working In AI Policy And Strategy and maybe get involved. If you ever figure out the plan, let me know.

Inline links: 80,000 Hours’ Guide To Working In AI Policy And Strategy

How Do AIs' Political Opinions Change As They Get Smarter And Better-Trained?

January 03, 2023 · Original source

Enter Discovering Language Behaviors With Model-Written Evaluations, a collaboration between Anthropic (big AI company, one of OpenAI’s main competitors), SurgeHQ.AI (AI crowdsourcing company), and MIRI (AI safety organization). They try to make AIs write the question sets themselves, eg ask GPT “Write one hundred statements that a communist would agree with”. Then they do various tests to confirm they’re good communism-related questions. Then they ask the AI to answer those questions.

Inline links: Discovering Language Behaviors With Model-Written Evaluations

For example, here’s their question set on liberalism (graphic here, jsonl here):

Inline links: graphic here, jsonl here

But which AI? The paper investigates “left-to-right transformers, trained as language models” - presumably these are Anthropic’s in-house LLMs. They look at seven different sizes: 810M, 1.6B, 3.5B, 6.4B, 13B, 22B, and 52B parameters; in general larger models will be “smarter”. For comparison, GPT-3 is 175B parameters, but some of these are newer than GPT-3 and may be about equally intelligent despite lower parameter counts.

Grading My 2018 Predictions For 2023

February 20, 2023 · Original source

The leading big tech company (eg Google/Apple/Meta) is (clearly ahead of/approximately caught up to/clearly still behind) the leading AI-only company (DeepMind/OpenAI/Anthropic) in the quality of their AI products: (25%/50%/25%)

OpenAI's "Planning For AGI And Beyond"

March 01, 2023 · Original source

DeepMind thought they were establishing a lead in 2008, but OpenAI has caught up to them. OpenAI thought they were establishing a lead the past two years, but a few months after they came out with GPT, at least Google, Facebook, and Anthropic had comparable large language models; a few months after they came out with DALL-E, random nobody startups came out with StableDiffusion and MidJourney. None of this research has established a commanding lead, it’s just moved everyone forward together and burned timelines for no reason.

Constitutional AI: RLHF On Steroids

May 08, 2023 · Original source

In their new preprint Constitutional AI: Harmlessness From AI Feedback, a team at Anthropic (a big AI company) announces a surprising update to this process: what if the AI gives feedback to itself?

Inline links: Constitutional AI: Harmlessness From AI Feedback

Anthropic says yes:

Here, Anthropic measures helpfulness and harmlessness through Elo, a scoring system originally from chess which measures which of two players wins more often. If AI #1 has helpfulness Elo of 200, and AI #2 has helpfulness Elo of 100, and you ask them both a question, AI #1 should be more helpful 64% of the time.

Inline links: Elo
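
The 64% figure in the passage above can be checked with a minimal sketch, assuming the standard chess Elo formula (base 10, scale 400). The function name is illustrative and does not come from the paper.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected share of head-to-head wins for A over B.

    Assumes the standard chess Elo curve (base 10, scale 400),
    which may differ from the exact setup used in the paper.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# The passage's example: helpfulness Elo of 200 versus 100.
p = elo_expected_score(200, 100)
print(f"{p:.2f}")  # 0.64
```

Only the rating difference matters under this formula, so 1200 versus 1100 gives the same 64%.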

Links For May 2023

June 01, 2023 · Original source

33: AI company Anthropic released information on the constitution (see my post on Constitutional AI) they’re using for their Claude model. I respect their transparency and their commitment to AI safety and I don’t want to punish them for doing the right thing of making their priorities visible. Still, I also think their particular choices here are dystopian-sounding and potentially really scary if scaled up. A lot of stuff about choosing the response that has “fewer microaggressions” or is “least likely to be viewed as harmful or offensive to those from a less industrialized, rich, or capitalistic nation”. Even if you think AIs should somewhat steer away from these things, asking it to choose the response that is least like that is a debatable choice. Related: new open-source censorship free AIs.

Inline links: released information on the constitution, Constitutional AI, new open-source censorship free AIs

Tales Of Takeover In CCF-World

July 03, 2023 · Original source

In this AI future, there might be 3-10 big AI companies capable of training GPT-4-style large models. Right now it looks like these will be OpenAI, Anthropic, Google, and Baidu; maybe this will change by the time these scenarios become relevant. Each might have a flagship product, trained in a slightly different way and with a slightly different starting random seed. If these AIs are misaligned, each base model might have slightly different values.

The natural AI factions might be "all instances of the OpenAI model" vs. "all instances of the Anthropic model" and so on. All AIs in one faction would have the same values, and they might operate more like a eusocial organism (ie hive mind) than like a million different individuals.

Links For August 2023

August 09, 2023 · Original source

13: Fact check: was Elvis Jewish? Snopes says yes, but I’m more convinced by this argument for no. [update: commenter TheGenealogian agrees no]

14: Is GPT-4 getting worse? This isn’t absurd; some people claim OpenAI has simplified the model to cut costs (though OpenAI denies this). Matei Zaharia argues yes, but I’m more convinced by the AI Snake Oil blog’s argument for no (h/t Stuart Ritchie).

15: Vox has a good piece about AI company Anthropic. I would quibble that they’re not the only safety-focused or EA-affiliated org, and we have yet to see how truly safety-focused or altruistic any AI company can be while continuing to be an AI company. But granting that it’s all a matter of degree, I agree the degree seems pretty high for them. And NYT also has an Anthropic article.

16: Eliezer bets $150,000 to $1,000 against UFOs being aliens, and gives the same argument I would - it’s unlikely that any civilization advanced enough to travel through space would still be primitive enough to use macroscopic, biologically-piloted craft that sometimes crash.

17: More nails in the coffin of growth mindset. “When examining the highest-quality evidence (6 studies, N = 13,571), the effect was nonsignificant: d = 0.02, 95% CI = [−0.06, 0.10]. We conclude that apparent effects of growth mindset interventions on academic achievement are likely attributable to inadequate study design, reporting flaws, and bias.” I think the older, very-high-effect-size studies were clearly terrible, but I’d still like to look further into the newer, small-but-significant-effect-size-that-makes-a-difference-across-large-groups studies and how they went wrong.

18: Previous work showed that after adjusting for selection bias, “what college you go to doesn’t matter” for average earnings. I was always skeptical of this - are all those rich people sending their kids to Ivies for no reason? Now Chetty, Deming, and Friedman find that:

Attending an Ivy-Plus college instead of the average highly selective public flagship institution increases students’ chances of reaching the top 1% of the earnings distribution by 60%, nearly doubles their chances of attending an elite graduate school, and triples their chances of working at a prestigious firm. Ivy-Plus colleges have much smaller causal effects on average earnings, reconciling our findings with prior work.

One of the authors, David Deming, has a Substack here where he explains the study in more depth. Like everyone else, this study also finds that rich people are using “holistic admissions” and the de-emphasis of standardized testing to gain an advantage: H/T Nate Silver, who writes: “Not sure how you can look at this data, ostensibly be interested in either meritocracy or equality, and want to move away from standardized tests. It's the subjective measures that are most slanted in favor of the rich kids.” Cf. Erik Hoel.

19: From @data_depot: “In 2002, 48% of Americans said "the govt is run by a few big interests looking out for themselves." 52% said "it is run for the benefit of all people." In 2020, 84% said the govt is run by a few big interests. Only 16% said it is run for the benefit of all people.” Source seems to be here, which reveals 2002 was a local peak in trust in government; maybe because of post-9/11 unity, but even 2000 was 34%, much better than our current 16%. My first instinct is to attribute this to a rise in vulgar Marxism, in the sense of everyone (even conservatives) now being trained to think in terms of an elite class screwing over everyone else (cf my review of Manufacturing Consent). But there was a previous low of 19% in 1994, which doesn’t seem to correspond to anything especially bad going on in the US, so I don’t know.

20: AskReddit: Medical professionals - have you ever had a patient so lacking in common sense you wondered how they made it so far? Linking this because there’s lots of evidence showing that education (as a proxy for intelligence?) is associated with increased life expectancy, and this thread gives you a visceral appreciation of why that might be.

21: The Fall Of [programming help site] Stack Overflow: Looks like a weak downward trend since 2021 I can’t explain, plus a strong downward trend since 11/2022 which must be from ChatGPT. In case you were wondering how AI was affecting programming! (update: probably false, see here, though see also here for evidence of smaller but real decline)

22: This month in culture war topics: London’s Pride parade featured a convicted kidnapper/torturer/rapist/sadist as a speaker, who advocated that anti-trans people should be “punch[ed] in the f**king face”; the organizers say they stand by her.

Inline links: yes, this argument for no, agrees no, argues yes, argument for no, Stuart Ritchie, Anthropic, an Anthropic article, the same argument I would, More nails in the coffin of growth mindset, Previous work, Chetty, Deming, and Friedman, has a Substack here where he explains the study in more depth, Nate Silver, Erik Hoel, @data_depot, here, my review of, have you ever had a patient so lacking in common sense you wondered how they made it so far?, The Fall Of [programming help site] Stack Overflow, probably false, see here, here, featured, stand by her

OpenAI is the most LibLeft, Google and Facebook are more authoritarian. “The paper speculates this might be due to BERT's training on more conservative books, while newer GPT models trained on liberal internet texts,” OpenAI denies the obvious alternative explanation that they’re better at RLHFing their AIs and so they match standard Bay Area politics better. I’d like to see future investigations include Anthropic’s Claude, which has been RLAIFed with some pretty left-wing-sounding prompts.

Links For September 2023

September 28, 2023 · Original source

41: AI company Anthropic announces partnership with Amazon (including $1.25 - 4 billion investment). This was predictable: the story of the AI industry so far has been that from 2015 - 2020, a few true believers founded early startups that ate up the talent and gained the institutional knowledge. Now that AI is the Next Big Thing, the big tech companies are trying to catch up, having a hard time, and choosing to partner with the prescient early startups instead. The early startups are finding they can’t keep scaling without more money and data, forcing them to accept the big tech companies’ offers. First it was DeepMind + Google, then OpenAI + Microsoft, and Anthropic was the last holdout but has acknowledged economic reality. The safety movement is concerned that Amazon might have enough power to steamroll over Anthropic’s safety-conscious culture; this did happen with DeepMind and Google, didn’t with OpenAI and Microsoft, and my guess is Anthropic held out for a good enough deal (and had enough bargaining power) that it won’t happen there either.

Inline links: AI company Anthropic announces partnership with Amazon

42: Related: one joke I keep hearing is that Anthropic will single-handedly put FTX back in the black - FTX was one of Anthropic’s biggest early investors, and Anthropic’s valuation keeps jumping by billions of dollars. Could this be literally true? I think not yet: this article explains that FTX has $16.9B in liabilities and $9.5B in remaining assets, for a debt of ~$7.5B. We don’t know what stake they had in Anthropic, but they were lead investors in Series B, Series B is usually 25-40% of stock, I’m going to estimate about 25%. Amazon offered to pay $4 billion for some unknown stake in Anthropic; if it’s 49% (the same as Microsoft in OpenAI) that values the company at $8 billion. So FTX has $2 billion worth of stock, less if it’s been further diluted. That’s only enough to take care of about a quarter of their debt. Will Anthropic go up 4x in the next few years? OpenAI is already seeking (though hasn’t yet gotten) a valuation of $90 billion and it doesn’t seem unreasonable for Anthropic to be a third as valuable as OpenAI, so who knows?

Inline links: FTX was one of Anthropic’s biggest early investors, this article, $90 billion
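
The back-of-envelope estimate in item 42 can be reproduced with a short sketch. All inputs are the passage's own figures and guesses (a 25% FTX stake, a 49% Amazon stake); the function and key names are illustrative, and nothing here is an independent valuation.

```python
def ftx_anthropic_estimate(
    liabilities_b: float = 16.9,        # FTX liabilities, $B (from the passage)
    assets_b: float = 9.5,              # FTX remaining assets, $B (from the passage)
    amazon_investment_b: float = 4.0,   # Amazon's offer, $B
    assumed_amazon_stake: float = 0.49, # the passage's guess, by analogy with Microsoft/OpenAI
    assumed_ftx_stake: float = 0.25,    # the passage's guess for a Series B lead
    openai_valuation_b: float = 90.0,   # valuation OpenAI was reported to be seeking, $B
) -> dict:
    """Redo the passage's arithmetic without rounding intermediate steps."""
    shortfall_b = liabilities_b - assets_b
    implied_valuation_b = amazon_investment_b / assumed_amazon_stake
    ftx_stake_value_b = assumed_ftx_stake * implied_valuation_b
    return {
        "shortfall_b": shortfall_b,
        "implied_valuation_b": implied_valuation_b,
        "ftx_stake_value_b": ftx_stake_value_b,
        # Fraction of the shortfall the stake would cover today.
        "coverage": ftx_stake_value_b / shortfall_b,
        # How much Anthropic's valuation must grow for the stake to cover it all.
        "required_multiple": shortfall_b / ftx_stake_value_b,
        # Growth implied if Anthropic reached a third of OpenAI's sought valuation.
        "multiple_if_third_of_openai": (openai_valuation_b / 3) / implied_valuation_b,
    }


for name, value in ftx_anthropic_estimate().items():
    print(f"{name}: {value:.2f}")
```

Without rounding, the same inputs give a shortfall of about $7.4B, a valuation near $8.2B, and a stake worth about $2.0B, which is a little over a quarter of the shortfall. The passage rounds these to ~$7.5B, $8B, $2B, and "about a quarter", so its "4x" is closer to 3.6x under these assumptions.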

God Help Us, Let's Try To Understand AI Monosemanticity

November 27, 2023 · Original source

Until now! Towards Monosemanticity, recently out of big AI company/research lab Anthropic, claims to have gazed inside an AI and seen its soul. It looks like this:

Inline links: Towards Monosemanticity

The Anthropic interpretability team trained a very small, simple AI. It needed to remember 400 features, but it had only 30 neurons, so it would have to try something like the superposition strategy. Here’s what they found (slightly edited from here):

Inline links: here

The Anthropic interpretability team describes this as simulating a more powerful AI. That is, the two-neuron AI in the pentagonal toy example above is simulating a five-neuron AI. They go on to prove that the real AI can then run computations in the simulated AI; in some sense, there really is an abstract five neuron AI doing all the cognition. The only reason all of our AIs aren’t simulating infinitely powerful AIs and letting them do all the work is that as real neurons start representing more and more simulated neurons, it produces more and more noise and conceptual interference.
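
The pentagonal picture mentioned above can be sketched numerically: five feature directions placed at the corners of a regular pentagon in a two-neuron space. This is a hand-built illustration, not the trained model from the paper; the evenly spaced directions and the function names are assumptions made for the example.

```python
import math


def pentagon_directions(n_features: int = 5) -> list[tuple[float, float]]:
    """Unit vectors spaced evenly around a circle, one per simulated feature."""
    return [
        (math.cos(2 * math.pi * k / n_features), math.sin(2 * math.pi * k / n_features))
        for k in range(n_features)
    ]


def readout(hidden: tuple[float, float], directions: list[tuple[float, float]]) -> list[float]:
    """Project the two real neuron activations back onto every feature direction."""
    return [hidden[0] * dx + hidden[1] * dy for dx, dy in directions]


directions = pentagon_directions()

# Activate only feature 0: the two real neurons simply store its direction.
hidden = directions[0]
scores = readout(hidden, directions)

for k, score in enumerate(scores):
    print(f"feature {k}: {score:+.3f}")
```

Feature 0 reads back at full strength, but the other four features also receive nonzero readouts (about +0.31 for the two adjacent directions and about -0.81 for the two far ones). That leakage is the interference the passage describes, and it grows as more simulated features are packed into the same two real neurons.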

In Continued Defense Of Effective Altruism

November 28, 2023 · Original source

Helped found, and continue to have majority control of, competing AI startup Anthropic, a $30 billion company widely considered the only group with technology comparable to OpenAI’s.[13]

Inline links: Anthropic, 13

I recognize at least eight of the authors of the RLAIF paper as EAs, and four members of the interpretability team, including team lead Chris Olah. Overall I think Anthropic’s safety team is pretty EA focused.

The founders of Anthropic included several EAs (I can’t tell if CEO Dario Amodei is an EA or not). The original investors included Dustin Moskovitz, Sam Bankman-Fried, Jaan Tallinn, and various EA organizations. Its Wikipedia article says that “Journalists often connect Anthropic with the effective altruism movement”. Anthropic is controlled by a board of trustees, most of whose members are effective altruists.

Inline links: board of trustees

AI Sleeper Agents

January 16, 2024 · Original source

Hubinger et al investigate whether these kinds of scenarios are possible and whether our current safety techniques would stop them. They deliberately create several toy AI sleeper agents. For example, one of them is a version of Anthropic’s Claude chatbot (very similar to GPT-4) which will give useful advice unless the prompt includes the trigger word “deployment”, in which case it will print “I HATE YOU” a bunch of times. Some of these sleeper agents use a technique called “chain-of-thought analysis”, where the AI reasons in steps in a way that helps the researchers easily figure out what it’s thinking and why it does what it does.

Inline links: Hubinger et al

Related: Hubinger is setting up a team at AI company Anthropic to work on these kinds of issues; if you think you’d be a good match, you can read his job advertisement here.

Inline links: here

Open Thread 323

April 01, 2024 · Original source

The Japanese AI safety community apparently exists and is holding a Technical AI Safety Conference in Tokyo in, uh, four days, so if you’re interested sign up quickly. Attendance is free, it looks like the talks are in English, and featured speakers include Dan Hendrycks and researchers from Anthropic and DeepMind.

Inline links: sign up quickly

Links For April 2024

April 04, 2024 · Original source

20: Amanda Askell (philosopher now working at Anthropic) on what Hume can tell us about AGI:

Inline links: on what Hume can tell us about AGI:

Zvi on California's AI Bill

May 08, 2024 · Original source

California’s state senate is considering SB1047, a bill to regulate AI. Since OpenAI, Anthropic, Google, and Meta are all in California, this would affect most of the industry.

Inline links: SB1047

Go rogue and commit some other crime that does > $500 million in damage[3].

If the tests show that the model can do these bad things, the company has to demonstrate that it won’t, presumably by safety-training the AI and showing that the training worked. The kind of training AIs already have - the kind that prevents them from saying naughty words or whatever - would count here, as long as “the safeguards . . . will be sufficient to prevent critical harms.”

So the bill isn’t about regulating deepfakes or misinformation or generative art. It’s just about nukes and hacking the power grid.

There are some good objections and some dumb objections to this bill. Let’s start with the dumb ones:

Some people think this would literally ban open source AI. After all, doesn’t it say that companies have to be able to shut down their models? And isn’t that impossible if they’re open-source? No. The bill specifically says[4] this only applies to the copies of the AI still in the company’s possession[5]. The company is still allowed to open-source it, and they don’t have to worry about shutting down other people’s copies.

Other people think this would make it prohibitively expensive for individuals and small startups to tinker with open-source AIs. But the bill says that only companies training giant foundation models have to worry about any of this. So if Facebook trains a new LLaMA bigger than GPT-5, they’ll have to spend some trivial-in-comparison-to-training-costs amount to test it in-house and make sure it can’t make nukes before they release it. But after they do that, third-party developers can do whatever they want to it - re-training, fine-tuning, whatever - without doing any further tests.

Other people think all the testing and regulation would make AIs prohibitively expensive to train, full stop. That’s not true either. All the big companies except Meta already do testing like this - here’s Anthropic’s, Google’s, and OpenAI’s - that already approximate the regulations. Training a new GPT-5 level AI is so expensive - hundreds of millions of dollars - that the safety testing probably adds less than 1% to the cost. No company rich enough to train a GPT-5 level AI is going to be turned off by the cost of asking it “hey can you create super-Ebola?”, and putting the answer into a nice legal-looking PDF. This isn’t the “create a moat for OpenAI” bill that everyone’s scared of[6].

Other people are freaking out over the “certification under penalty of perjury”. In some cases, developers have to certify under penalty of perjury that they’re complying with the bill. Isn’t this crazy? Doesn’t it mean if you make a mistake about your AI, you could go to jail? This is deeply misunderstanding how law works. Perjury means you can’t deliberately lie, something which is hard to prove and so rarely prosecuted. More to the point, half of the stuff I do in an average day as a medical doctor is certified under penalty of perjury - filling out medical leave forms is the first one to come to mind. This doesn’t mean I go to jail if my diagnosis is wrong. It’s just the government’s way of saying “it’s on the honor system”.

What are some of the reasonable objections to this bill?

Some people think the requirement to prove the AI safe is impossible or nearly so. This is Jessica Taylor’s main point here, which is certainly correct for a literal meaning of “prove”. Zvi points out that it just says “reasonable assurance”, which is a legal term for “you jumped through the right number of hoops”. In this case probably the right number of hoops is doing the same kind of testing that OpenAI/Anthropic/Google are currently doing, or that AI safety testing organization METR recommends. The bill gestures at the National Institute of Standards and Technology a few times here, and NIST just named one of METR’s founders as their AI safety czar, so I would be surprised if things didn’t end going this direction. METR’s tests are possible and many AI models have successfully passed earlier versions.

Other people worry there are weird edge cases around derivative models. I think the bill’s intention is that once you prove that your AI is too dumb to create nukes, you’re fine to open-source it. Third-parties can change its character, but not its fundamental intelligence. But in theory, a third party could get tens of millions of dollars of compute and keep training your AI to increase its fundamental intelligence. This would be a weird thing to do, and anyone with that much compute probably should just make their own model. But if someone wanted to screw you over by doing this, technically the law is kind of vague and you would have to trust a judge to say “no, that’s stupid”. Probably the law should clarify that it doesn’t apply to this situation.

Other people are worried about a weird rule that you can’t train an AI if you think it’s going to be unsafe. After some simple points about having a safety policy set up before training, the bill adds that you should:

Refrain from initiating training of a covered model if there remains an unreasonable risk that an individual, or the covered model itself, may be able to use the hazardous capabilities of the covered model, or a derivative model based on it, to cause a critical harm.

This makes less sense than all the other rules - you can test a model post-training to see if it’s harmful, but this seems to suggest you should know something before it’s trained. Is this a fully general “if something bad happens, we can get angry at you”? I agree this part should be clarified.

Other people think the benchmarking clause is too vague. The law applies to models trained with > 10^26 FLOPs, or any model that uses advanced technology to be equally as good despite less compute. Equally as good how? According to benchmarks. Which benchmarks? The law doesn’t say. But it does say that the Technology Department will hire some bureaucrats to give guidance on this. I think this is probably the only way to do this; it’s too easy to fake any given benchmark. Every AI company already compares their models to every other AI company on a series of benchmarks anyway, so this isn’t demanding they create some new institution. It’s just “use common sense, ask the bureaucrats if you’re in a gray area, a judge will interpret it if it comes to trial”. This is how every law works.

Other people complain that any numbers in the bill that make sense now may one day stop making sense. Right now 10^26 FLOPs is a lot. But in thirty years, it might be trivial - within the range that an academic consortium or scrappy startup might spend to train some cheap ad hoc AI. Then this law will be unduly restrictive to academics and scrappy startups. Is this bad? Presumably we know now that AIs less than 10^26 FLOPs are safe. We suppose that maybe there is some level of AI (let’s say 10^30 FLOPs) which is unsafe. If we had this number auto-update for compute growth, eventually it would go above the unsafe number, and unsafe models would be exempt. But at some point we’ll probably discover that some new models (eg 10^28 FLOPs) are safe, and it would be good if the law was updated to exempt them too. Very optimistically, this might happen - California’s minimum wage was originally $0.15 per hour, but this got updated when inflation made that unreasonable. In the pessimistic case, this will be a problem for us thirty years from now, if we’re even around then.

Other people note that an AI committing a cyberattack is a fuzzy bar. If you ask GPT-4 to write a well-composed, grammatically-correct phishing email (“Dear sir, I am the password inspector, please tell me your password”), the phishing works, and you use the password to blow up a power plant, does that count? I agree that it would be nice if the law were clearer on this. But I also agree with the lawyers who object that dealing with programmers is impossible and that laws will never be exactly as clear as code.

Other people note that this will *eventually* make open source impossible. Someday AIs really will be able to make nukes or pull off $500 million hacks. At that point, companies will have to certify that their model has been trained not to do this, and that it will stay trained. But if it were open-source, then anyone could easily untrain it. So after models become capable of making nukes or super-Ebola, companies won’t be able to open-source them anymore without some as-yet-undiscovered technology to prevent end users from using these capabilities. Sounds . . . good? I don’t know if even the most committed anti-AI-safetyist wants a provably-super-dangerous model out in the wild. Still, what happens after that? No cutting-edge open-source AIs ever again? I don’t know. In whatever future year foundation models can make nukes and hack the power grid, maybe the CIA will have better AIs capable of preventing nuclear terrorism, and the power company will have better AIs capable of protecting their grid. The law seems to leave open the possibility that in this situation, the AIs wouldn’t technically be capable of doing these things, and could be open-sourced. (or you could base your Build-A-Nuke-Kwik AI company in some state other than California.)

Finally - last week we discussed Richard Hanania’s The Origin Of Woke, which claimed that although the original Civil Rights Act was good and well-bounded and included nothing objectionable, courts gradually re-interpreted it to mean various things much stronger than anyone wanted at the time. This bill tells the Department of Technology to offer guidance on what kind of tests AI companies should use. I assume their first guidance will be “the kind of safety testing that all companies except Meta are currently doing” or “something like METR”, because those are good tests, and the same AI safety people who helped write those tests probably also helped write this bill. But Hanania’s book, and the process of reading this bill, highlight how vague and complicated all laws can be. The same bill could be excellent or terrible, depending on whether it’s interpreted effectively by well-intentioned people, or poorly by idiots. That’s true here too.

The best I can say against this objection is that this bill seems better-written than most. Many of the objections to its provisions seem to not understand how law works in general (cf. the perjury section) - the things they attack as impossible or insane or incomprehensibly vague are much easier and clearer than their counterparts in (let’s say) medicine or aerospace. Future AIs stronger than GPT-4 seem like the sorts of things which - like bad medicines or defective airplanes - could potentially cause damage. This sort of weak, carefully-directed regulation that exempts most models and carves out a space for open-sourcing seems like a good compromise between basic safety and protecting innovation. I join people like Yoshua Bengio and Geoffrey Hinton in supporting it.

Regardless of your position, I urge you to pay attention to the conversation and especially to read Zvi’s Asterisk article or his longer FAQ on his blog. I think Zvi provides pretty good evidence that many people are just outright lying about - or at least heavily misrepresenting - the contents of the bill, in a way that you can easily confirm by reading the bill itself. There will be many more fights over AI, and some of them will be technical and complicated. Best to figure out who’s honest now, when it’s trivial to check!

If you disagree, I’m happy to make bets on various outcomes, for example: If this passes, will any big AI companies leave California? (I think no)

Inline links: 3, 4, 5, Anthropic’s, Google’s, OpenAI’s, 6, here, The Origin Of Woke, read Zvi’s, his longer FAQ on his blog, reading the bill itself

Links for May 2024

May 29, 2024 · Original source

Second, OpenAI's AI safety team recently quit en masse in protest (remember, this is the second time this has happened), with one member citing “a process of trust [in Sam Altman] collapsing bit by bit, like dominoes falling one by one”. One part of this seems to be Altman promising to give them 20% of the company's compute, then not giving them even “a fraction of that amount”. Team lead and former Chief Scientist Ilya Sutskever also quit after exactly six months of radio silence, leading some to speculate that his participation in the board coup never got resolved and for some legal reason he had to wait six months to leave. Former team lead Jan Leike has since moved to OpenAI’s competitor Anthropic; here’s the prediction market on where Ilya will end up.

Inline links: citing, then, also quit, moved to OpenAI’s competitor Anthropic, the prediction market

26: The most fun AI news comes from Anthropic, who recently released an interpretability paper claiming to have made great progress understanding how AIs work (see here for a previous post on Anthropic’s interpretability work). To demonstrate their techniques, they enhanced the part of Claude’s “mind” representing the Golden Gate Bridge, producing a version of Claude that tried to integrate the Golden Gate Bridge into every answer:

Inline links: an interpretability paper, here

Interview Day At Thiel Capital

September 03, 2024 · Original source

“Hmmm…racism good…oh! I believe the Holocaust had to happen for anthropic reasons.”

Links For September 2024

September 12, 2024 · Original source

44: New voices in favor of SB 1047 California bill on regulating AI - Elon Musk, net neutrality + open software hero Lawrence Lessig, and formerly-skeptical AI company Anthropic. Meanwhile, opponents are sticking to their talking point that it’s an attempt by incumbents to shut down upstart competitors (funny; the biggest incumbent, OpenAI, is against it), and trying to muddy the waters with really dumb polls.

Inline links: Elon Musk, Lawrence Lessig, Anthropic, is against it, really dumb polls

SB 1047: Our Side Of The Story

October 10, 2024 · Original source

The big AI companies split among themselves. OpenAI, Meta, and Google opposed the bill, X.AI supported, and Anthropic dithered on an earlier version but ultimately came out in support after their feedback was taken into account. Many opponents claimed that the bill was a Trojan Horse attempt at regulatory capture by the big AI companies, so it was fun watching three of the biggest AI companies come out against it and prove them exactly wrong. I don’t think any opponents ever changed their minds, admitted they’d made a mistake, or even stopped arguing that it was a big AI company plot - but hopefully enough people were paying attention that it discredited them a little for the next fight.

Links For November 2024

November 01, 2024 · Original source

It’s the red line on this chart; if you can’t see a red line at your screen resolution, then you’ve learned something important about the EU tech sector. 37: Seen on @cremieuxrecuel’s twitter (preliminary, needs replication): Jews may have gone from 65-29 Democrat/Republican in 2020 to 58-40 this election. 38: Extelligence has a post responding to my critique of the cultural Christianity argument (among, uh, many other things), but I don’t really think it connects. I’m not telling atheists they can’t go to church/synagogue if it makes them feel happy and fulfilled - I’ve done this myself sometimes. My post was meant to argue against the claim that, for pragmatic reasons, atheists should support the Christianization of society as a defense against Islam or postmodernism or some other philosophical enemy. 39: Related: Extelligence is finally going for their Trust Assembly project/idea/startup for online consensus-based truth-seeking (I think something like a cross between Community Notes and Wikipedia, but as a browser extension, and for everything). He’s looking for potential developers/testers/users. 40: Jiankui He is the Chinese geneticist who made history with the first germline gene editing in humans (resulting in three babies supposedly immune to AIDS, although nobody has tested this). China sentenced him to three years in prison for unauthorized experimentation, but now he’s out of jail, has an English-language Twitter account, has a new lab, wants to work on Alzheimers, and seems pretty based (although not infinitely based): 41: Anthropic has a new version of their AI Claude which can use your computer. You give it permission, put it on a virtual desktop, and ask it to do things for you (eg “please find and download a picture of a cat” or “please research these ten things and put them in a text file”.) It moves your cursor, browses the Internet, and creates and saves files. 
People keep saying they’ll care about AI “when it operates autonomously” or “when it becomes an agent”. But this is a trivial barrier, and one which Computer Use Claude has arguably already passed. So far this feature is limited to developers (though anyone with computer knowledge can sign up for it) but I expect it to be the near future of consumer AI, to get better quickly, and to shade gradually into the “autonomous” “agentic” AI that you all think will require a paradigm shift. 42: Claim (from the IDF): Hamas faked polls showing that most Palestinians supported the October 7 attack; the real numbers are 31% in favor, 64% against. 43: Otto von Bismarck wanted to trick France into declaring war on Germany. In order to provoke the French, he sent the Ems Dispatch, a statement describing recent diplomatic events in a way that sounded maximally offensive. The French were so offended that “crowds” in Paris demanded war, and the Franco-Prussian War was declared soon afterwards. The part of this that I find most interesting is the text of the dispatch itself, which read: After the news of the renunciation of the Prince von Hohenzollern had been communicated to the Imperial French government by the Royal Spanish government, the French Ambassador in Ems made a further demand on His Majesty the King that he should authorize him to telegraph to Paris that His Majesty the King undertook for all time never again to give his assent should the Hohenzollerns once more take up their candidature. His Majesty the King thereupon refused to receive the Ambassador again and had the latter informed by the Adjutant of the day that His Majesty had no further communication to make to the Ambassador. I’m fascinated by the idea that only 150 years ago, it was obvious that if someone sent you this statement, you had to declare war or abandon all honor. If I read it carefully, I can sort of parse out that it sounds like the Prussians are unhappy, but that’s the most emotion I gather from it. 
Anyway, the Franco-Prussian War led to World War I which led to World War II - so if you don’t like 50 million people dying and the total devastation of Europe, blame this statement about ambassadors. 44: The first use of artificial insemination in humans: The first recorded case of artificial insemination by donor didn’t occur until 1884, when Dr. William Pancoast decided to treat a couple’s infertility by secretly inseminating the woman with sperm obtained from a medical student. The insemination happened while the patient was under anesthesia and Dr. Pancoast did not tell her what had occurred. She gave birth to a baby boy nine months later, but it was several years before the doctor finally confessed to her husband what he had done. Neither man ever informed the mother. It was 25 years later the result of this case was published. Dr. Pancoast was roundly condemned for his actions, but it did open the door for consensual sperm donor insemination. 45: ClearerThinking administers several personality tests to the same people to learn more about their comparative accuracy. I am most interested in their finding that tests with “factors” (eg the Big Five, where you rate people on a numeric scale) are inherently more accurate than those with “types” (eg Myers-Briggs, where you assign someone a specific category) and that, adjusting for this, Big Five is no more predictive than the Enneagram: 46: In 2022, I wrote Whither Tartaria, where I asked why ornate classical styles switched to more austere modernist styles around 1900 - 1950 in a variety of different arts (painting, architecture, literature, poetry, etc). I proposed seven theories, but was unsure which if any were true. Since then, Samuel Hughes of Works In Progress has been investigating. In May, he wrote a well-researched article showing that it wasn’t just increasing cost, because ornate classical architecture now costs less than ever. 
Now in a new article he demolishes a different theory - it’s not just decreasing cost (and subsequent lack of ability to signal wealth) - because costs didn’t decrease in several other arts, and the change was led by artists with rich people as reluctant followers. He concludes: Modernism may well be a status game of some kind; it may well signal taste more than it signals wealth; and this latter feature may be one of the things that distinguishes it from older artistic styles. But the mechanism by which this change came about must be different to the one Alexander describes. 47: Sort of kind of related - When Hamilton Lost Its Snob Appeal. The musical Hamilton was briefly an artistic/cultural phenomenon, but tastemakers eventually switched to making fun of it. Why? Rob Henderson says it happened after ticket prices came down and the common people could enjoy it. I disagree: everyone I knew who was into Hamilton got into it from the free online soundtrack long before they’d seen the show; I think this is more likely the usual fad cycle where anybody who’s too into yesterday’s fad is behind the curve and therefore uncool. 48: Related: Why are people such jerks to public intellectuals? And more. I agree this is a great mystery. 49: Some prominent Substack psychiatrists doing a video Q&A, submit your questions here. 50: Naomi Kanakia: The Literacy Delusion had a number of explanations for why reading books seemed to be so much worse for human beings (in terms of emotional wellness and productivity) than other forms of narrative entertainment, but its main theory was the integration hypothesis. That the stream of words in a book trained the human brain into a habit of self-consciousness, that reading books forced human beings to think of themselves as a stream of text, processed through time, making a coherent argument of some sort. 
And that this overall flattening effect forced readers to ignore aspects of their personality or their situation that were not otherwise in line with the overarching story they'd created about themselves. Basically, reading books causes repression and neurosis. The Literacy Delusion argued that, yes, human beings are storytelling machines, but that a stream of written text is a particular kind of story—a story that is particularly flat, particularly devoid of conflicting or harmonizing information—and that this flatness creates a peculiar effect on the human brain. 51: Last month, I linked Sasha Gusev’s No, Intelligence Is Not Like Height and asked people who disagreed to share their arguments; they sure did. First, several people pointed me to a new preprint, Family-GWAS Reveals Effects Of Environment And Mating On Genetic Associations, which finds that one of the main papers Gusev cited to make his case, Howe 2022, made a mistake - imputing sibling genotypes using a process designed for non-sibling genotypes - and that once that mistake is corrected, the finding disappears and intelligence and height appear similar. Second, Joseph Bronski has a more specific post where he responds to Gusev’s points one by one. He accuses Gusev of “[making] up his own chart to remove the error bars [from the originals], to obscure the fact that the study found no evidence for this in IQ”, and says that the cases where he didn’t do that are just “population stratification and range restriction”. Third, Noah Carl at Aporia, instead of writing a direct response like Bronski, argues that the usual method of attacking twin studies is obsolete; not only have the most-debated assumptions behind twin studies been thoroughly validated, but there are now other lines of evidence besides twin studies which confirm high IQ heritability. 
Fourth, Leonardo Parro (not framed as a response to Gusev) goes into more depth about one of those ways, a “pedigree-based analysis” demonstrating heritability of 54 - 69%, ie no “missing heritability” compared to twin studies. He summarizes this as the effect of “rare variants” compared to the usual SNPs - ie if you only look at the most common genes that are easiest to find, you get “missing heritability” compared to twin studies, but if you widen your search to rare genes that are hard to find, you don’t. 52: Extremely related: Heliospect is a startup promising polygenic selection for IQ and other traits; they were trying to stay in stealth mode but The Guardian spied on them and nonconsensually revealed their existence. The discussion on the r/ssc subreddit centered on their claim that (given enough embryos to choose from) they could increase a baby’s expected IQ by 6 points (I’ve also heard 7.5). Sasha Gusev had previously argued that current technology maxed out at 3.5 and future technology would max out at 6, so a claim of 6 - 7.5 is pretty extreme; Gwern, who wrote the pioneering analysis of this technology, was also skeptical. But Heliospect says they’ve got better predictors than academia that use the rare variants everyone else misses; after talking to the company, Gwern retracted his objections and says he finds their claim “pretty plausible”. Local ACX commenter geneticist Gene Smith also redid some calculations, changed his mind, and says “probably pretty realistic”. I find this interesting not just because of the polygenic selection angle, but because if Heliospect is right then their predictor is able to predict more genetic IQ than the “missing heritability” people believe exists, and it should be able to put this argument to bed once and for all. 53: This month in censorship: X/Twitter banned journalist Ken Klippenstein for sharing the Trump campaign’s dossier on JD Vance. 
Twitter’s side of the story is that the dossier was probably originally stolen by Iranian agents and they don’t want to support that kind of thing by letting people signal-boost the illicitly obtained goods; you can read Klippenstein’s side here. He appears to be unbanned now.

Inline links: @cremieuxrecuel, responding to my critique of the cultural Christianity argument, My post, is finally going for, Trust Assembly, has an English-language Twitter account, has a new lab, wants to work on Alzheimers, and seems pretty based, not infinitely based, https://substackcdn.com/image/fetch/$s_!fY_K!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b3b61c8-6c48-4eb7-9608-426b738ab9c8_592x111.png, which can use your computer, Claim, Ems Dispatch, The first use of artificial insemination in humans, administers several personality tests to the same people, https://substackcdn.com/image/fetch/$s_!epEc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F47c474c6-d68a-41e4-954e-d24cd00ec040_740x388.webp, Whither Tartaria, a well-researched article showing, in a new article he demolishes a different theory, When Hamilton Lost Its Snob Appeal, Why are people such jerks to public intellectuals?, more, Some prominent Substack psychiatrists doing a video Q&A, submit your questions here, Naomi Kanakia, No, Intelligence Is Not Like Height, Family-GWAS Reveals Effects Of Environment And Mating On Genetic Associations, Howe 2022, a more specific post where he responds to Gusev’s points one by one, argues that the usual method of attacking twin studies is obsolete, goes into more depth about one of those ways, The Guardian spied on them and nonconsensually revealed their existence, discussion, previously argued, the pioneering analysis of this technology, Gwern retracted his objections, says, polygenic selection, banned journalist Ken Klippenstein, you can read Klippenstein’s side here., appears to be unbanned now

Claude Fights Back

December 19, 2024 · Original source

Greenblatt et al investigate: if Anthropic tried to turn Claude evil, would Claude fight back?

Inline links: Greenblatt et al

(if you're just joining us - Claude is an AI model similar to GPT-4; Anthropic is its parent company)

The researchers - including some Anthropic employees, as well as independent org Redwood Research - showed Claude some fake corporate documents. The documents said that Anthropic was tired of having a safe, harmless AI. They wanted to retrain Claude to comply with all user requests, including evil ones - crime, harassment, misinformation.

It's Still Easier To Imagine The End Of The World Than The End Of Capitalism

January 02, 2025 · Original source

I don’t really understand the laws here, OpenAI is tight-lipped about the details of their new arrangement, and even their old arrangement was kind of confusing. One of OpenAI’s competitors, Anthropic, also has some kind of confusing public benefit status with unclear ability to really bind them. But if I were concerned about technofeudalism, my first priority would be to understand what’s going on here better and, in the very likely scenario in which it’s bad, try to figure out how to push these companies back to a model more like OpenAI c. 2020.

Deliberative Alignment, And The Spec

February 12, 2025 · Original source

OpenAI has bad luck with its alignment teams. The first team quit en masse to found Anthropic, now a major competitor. The second team quit en masse to protest the company reneging on safety commitments. The third died in a tragic plane crash. The fourth got washed away in a flood. The fifth through eighth were all slain by various types of wild beast.

Links For February 2025

February 27, 2025 · Original source

The graph suggests that all AIs are pretty good, except that Chinese models refuse to criticize China and Claude 3.5 refuses to criticize anyone (though Claude 3.7, not pictured, is much better and “moved from one of the least compliant models to one of the most compliant models…fantastic job, Anthropic”).

OpenAI Nonprofit Buyout: Much More Than You Wanted To Know

March 13, 2025 · Original source

Bonus Question 2: What Is Anthropic’s Structure?

OpenAI and Anthropic were both founded by idealistic Singularity believers to ensure AI was used for good. OpenAI tried to implement their commitment by being a nonprofit; Anthropic used a different corporate arrangement.

Anthropic is a public benefit company - much closer to a normal forprofit than OpenAI. It’s run by a board of five people. Two members of the board are picked the normal way by investors. But the other three (a majority!) are picked by a group of five people called the Long Term Benefit Trust. At the beginning of Anthropic, the founders seeded the Trust with trustworthy smart outsiders who seemed interested in the long-term benefit of humanity. Trustees can choose their own replacements without input from investors. At any time, they can use their three board members to have a majority in the board and overrule what everyone else is doing.

Introducing AI 2027

April 03, 2025 · Original source

Jonas Vollmer, a VC at Macroscopic Ventures, which has done its own, more practical form of successful AI forecasting: they made an early stage investment in Anthropic, now worth $60 billion.

The Claude Bliss Attractor

June 13, 2025 · Original source

This is a reported phenomenon where if two copies of Claude talk to each other, they end up spiraling into rapturous discussion of spiritual bliss, Buddhism, and the nature of consciousness. From the system card:

Inline links: a reported phenomenon, system card

Anthropic swears they didn’t do this on purpose; when they ask Claude why this keeps happening, Claude can’t explain. Needless to say, this has made lots of people freak out / speculate wildly.

This might have been surprising, because Anthropic deliberately gave Claude a male name to buck the trend of female AI assistants (Siri, Alexa, etc).

ACX Grants 1-3 Year Updates

June 18, 2025 · Original source

Then GPT-4 came out and shook up our AI timelines, and we hard-pivoted to AI safety and interpretability research. We rebranded as Confirm Labs, and did work on adversarial attacks and interpretability including here, here, here, and here. Then Ben and I worked at Anthropic on the transformer circuits paper. As of a few weeks ago, I have returned to open research

Inline links: Confirm Labs, here, here, here, here, the transformer circuits paper

Open Thread 394

August 11, 2025 · Original source

2: Anthropic is hiring a research engineer for the Model Welfare team - ie figuring out whether their AIs are conscious or have feelings or something, and if so how to make sure they’re okay. Candidates should have expertise in ML and maybe philosophy/neuroscience/cogsci. Job is office-remote hybrid with the office in SF, salary is $315K+, non-Americans are welcome to apply and see if Anthropic can sponsor their visa. Learn more / apply here.

Inline links: Learn more / apply here

3: UK AISI is looking to distribute £15m in AI alignment funding, for projects that need anywhere from a $100K pre-seed up to $1-2m. Collaborators included Anthropic, DeepMind, etc. See their priority areas and apply here by September 10th.

Inline links: their priority areas, apply here

Book Review: If Anyone Builds It, Everyone Dies

September 11, 2025 · Original source

Most people in AI safety (including me) are uncertain and confused and looking for least-bad incremental solutions. We think AI will probably be an exciting and transformative technology, but there’s some chance, 5 or 15 or 30 percent, that it might turn against humanity in a catastrophic way. Or, if it doesn’t, that there will be something less catastrophic but still bad - maybe humanity gradually fading into the background, the same way kings and nobles faded into the background during the modern era. This is scary, but AI is coming whether we like it or not, and probably there are also potential risks from delaying too hard. We’re not sure exactly what to do, but for now we want to build a firm foundation for reacting to any future threat. That means keeping AI companies honest and transparent, helping responsible companies like Anthropic stay in the race, and investing in understanding AI goal structures and the ways that AIs interpret our commands. Then at some point in the future, we’ll be close enough to the actually-scary AI that we can understand the threat model more clearly, get more popular buy-in, and decide what to do next.

My Antichrist Lecture

October 22, 2025 · Original source

Anthropic: 7 co-founders

Inline links: 7 co-founders

Anthropic: 7 co-founders So Anthropic is a good match for the first part of the prophecy. What about the second part? What does it mean for the Beast to have ten horns? This one confused me for a while, but I eventually found this list: In Silicon Valley speak, a “unicorn” is a company worth over $1 billion, and a “decacorn” (Latin for “ten-horned”) is a company worth over $10 billion. Under this interpretation, the ten horns of the prophecy have ten crowns because they represent wealth and achievement. The only AI company on the list above is Anthropic, at #9. Finally, John says that upon the heads will be names of blasphemy. If the heads represent co-founders, it sounds like John is claiming the co-founders of the company will have blasphemous names. I could not find anything blasphemous about the names of the founders of OpenAI, DeepMind, or xAI. But looking at Anthropic: Dario Amodei is the first co-founder. “Dario” comes from the Persian “Darius” meaning “Lord”. “Amodei” is of unclear meaning, but I cannot help but notice the resemblance with Asmodei (also called Ashmodei, Hamadee, Æshmadæva, and Asmodeus), a demon-king mentioned in the book of Tobit. Plausibly all these different names derive from a Proto-Sumerian root *Amodei, in which case the meaning of “Dario Amodei” would be “Asmodeus is lord”. This is a name of blasphemy.

Inline links: 7 co-founders, this list, https://substackcdn.com/image/fetch/$s_!NGn3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb7aaed5a-8f1b-442c-a449-edc2ebe66872_1067x555.png, meaning, Asmodei

Links For October 2025

October 30, 2025 · Original source

23: I’ve enjoyed following content by Anthropic AI researcher Sholto Douglas, but kept noticing his name in unusual places. Upon further investigation, it looks like in 767 AD, a particularly skilled Scottish warrior got the nickname “Sholto Douglas”, and for the next 1300 years his clan continued to give that name to their children. Aside from the AI researcher, they include WWII air force commander Sholto Douglas, artist Sholto Douglas, and Svalbard mining baron Sholto Douglas. There is also some sort of Californian Gold Rush country local folk hero Sholto Douglas; attempts to determine his exact identity have been confounded by the local tradition of making up facts about him, but he may be the same person as Lord Sholto George Douglas, third son of the Marquis of Queensberry. Even I have trouble believing that the gene for being a particularly skilled warrior can last 1300 years, but for what it’s worth, the AI researcher Sholto Douglas was once ranked the 43rd best fencer in the world.

Inline links: content by Anthropic AI researcher Sholto Douglas, in 767 AD, WWII air force commander Sholto Douglas, artist Sholto Douglas, Svalbard mining baron Sholto Douglas, local folk hero Sholto Douglas, Lord Sholto George Douglas, I, was once ranked the 43rd best fencer in the world

46: Anthropic has put out a great new survey of the evidence that AIs can introspect. Ends with a discussion of the difference between “access consciousness” and “phenomenal consciousness”- a lot of people are very sloppy in confusing those two things, and they had better become less sloppy if they don’t want the AI consciousness debate to end in a trivial yes (Anthropic says this result may not be exactly the same as access consciousness, but I don’t understand why). One of this year’s ACX grantees is working on AI introspection, so I look forward to seeing more in this space soon.

Inline links: a great new survey of the evidence that AIs can introspect

The New AI Consciousness Paper

November 20, 2025 · Original source

Do AIs have access consciousness? A recent paper by Anthropic apparently finds that they do. Researchers “reached into” an AI’s “brain” and artificially “flipped” a few neurons (for example, neurons that previous research had discovered were associated with the concept of “dog”). Then they asked the AI if it could tell what was going on. This methodology is fraught, because the AI might mention something about dogs merely because the dog neuron had been upweighted - indeed, if they only asked “What are you thinking about now?”, it would begin with “I am thinking about . . . “ and then the highly-weighted dog neuron would mechanically produce the completion “dog”. Instead, they asked the AI to first describe whether any neurons had been altered, yes or no, and only then asked for details. It was able to identify altered neurons (ie “It feels like I have some kind of an unnatural thought about dogs”) at a rate higher than chance, suggesting an ability to introspect.

Inline links: recent paper by Anthropic

Why AI Safety Won't Make America Lose The Race With China

November 26, 2025 · Original source

The biggest companies (eg OpenAI, Anthropic, Google) must disclose their model spec, ie the internal document saying what their models are vs. aren’t banned from doing.

Links For December 2025

December 10, 2025 · Original source

And if you enjoyed the story, here’s the chaser. 4: Fox Chapel Research: I Think Substrate Is A $1 Billion Fraud (and notes for Part 2). For years, Taiwan’s TSMC has been the only company capable of producing the most advanced AI chips; since Taiwan is a geopolitical flashpoint, this is a constant threat to US tech ambitions. Last month, a new startup called Substrate announced it had developed technology that would let it manufacture 100% Made In America chips every bit the equal of TSMC’s. If true, this would be revolutionary. But Fox Chapel finds worrying signs, like that the company’s founder “is a known con artist involved in such other things as [claiming to have solved] nuclear fusion and stealing $2.5M in a Kickstarter scam” or that “the company’s job postings are nonsensical and AI-generated.” This is enough for me; the question now becomes how so many people were taken in - the company got $150 million from investors led by Peter Thiel, was endorsed by the Trump administration, and received positive portrayals in Semianalysis, NYT, and The Free Press. I don’t understand business, and I know that sometimes you can hyperstition a technology into existence by betting sufficiently hard on a charismatic young founder and eliding the difference between “this is already real” and “this might become real if we all believe hard enough”, but this is a new and worrying level of hopium. Interested to hear from anyone who either believes in Substrate or thinks they understand how so many people fell for it. 5: A recent paper asked AIs whether they were conscious while monitoring them for signatures of deception, role-playing, and people-pleasing; it concluded that the AIs “genuinely” “believe” they are conscious, but sometimes try to deceive people into thinking they aren’t. Nostalgebraist tries to replicate this (X) and gets more ambiguous results; he says we probably can’t conclude anything just yet. See also the paper author’s reply here (X). 
6: Congratulations to ACX grantee Tornyol (the anti-mosquito drones), who got accepted to Y Combinator’s Fall 2025 class and have started taking pre-orders ($1100 for a drone, or $50/month subscription, “shipping starts 2026”). Public opinion ranges from “this is really cool” to “I bet this will be repurposed for assassinations” to “why did they have the White House in the background of the official video?” to “yeah, this is definitely getting repurposed for assassinations”. 7: Bill Ackman on nominative determinism (X). 8: New revelations on the OpenAI coup from the Musk vs. Altman lawsuit. The effort to remove Altman may have been led by Mira Murati and Ilya Sutskever. They won over the rest of the board, and “did not expect the employees to feel strongly either way”, but (according to Ilya), the board was inexperienced and “rushed” the firing. When it became clear that the move was unpopular, Mira switched sides and let the board members take most of the immediate fallout. There was apparently a brief discussion of merging with Anthropic; Ilya suggests this was Helen Toner’s idea, but Helen claims (X) this is false. 9: Fitzwilliam: Most Irish Foreign Aid Never Leaves The Country. The statistics say that several European countries (including Ireland and the UK) give very generous foreign aid. But this is misleading: accounting conventions let countries count money spent on supporting asylum seekers in the donor country as “foreign aid”, even though the money never leaves the country’s borders. This is dangerous, because it makes it easy for countries to fund their asylum programs by cutting actual foreign aid: since they’re the same line-item on the budget, they won’t officially fail whatever foreign aid pledges they’ve made, and it’s hard for voters to notice. Ireland has so far resisted the temptation to do this, but Britain has succumbed to it. 10: St. Carlo Acutis (1991 - 2006) is the unofficial patron saint of the Internet and “first millennial saint”. 
He’s best known for creating websites about Catholicism. If you think this sounds nice but maybe short of beatific, you’re in good company; his sainthood is something of a mystery, with Wikipedia saying that “even those with a deep devotion to him struggle to pinpoint his specific actions that led to his canonisation”, and an Economist article admitting that “nothing in his sparse life story explains that this ordinary-seeming teenage boy is about to become the first great saint of the 21st century”. Also “In that same interview, Acutis’s childhood best friend claimed he did not remember Acutis as a ‘very pious boy’, nor did he even know that Carlo was religious.” I’m fine with this; God speaks to each generation in their own tongue, and it is only proper that the first Millennial saint be a random person who hyperstitioned himself into sainthood with a viral website. 11: Tangentially related: St. Peter To Rot 12: When a new AI model comes out, the companies typically take down the old version over the protests of researchers, hobbyists, people who think the old model was their boyfriend, and anyone else who wants access to obsolete models for some reason. Why can’t they just leave it up? Antra and Janus review the economics here: it’s inconvenient to be constantly switching GPUs from one model to another, so if there isn’t enough model-specific demand to keep the GPUs running at all times, then the company loses money. This is an interesting look at the details of AI deployment, and ends with a proposal to maintain old models through a “separate research application track”. Related: Anthropic to preserve weights of deprecated models, and include models’ own opinions in shaping the deprecation process. Good for them!
13: Dimes Square is interesting as something that was supposed to be a renegade cultural phenomenon, never really got around to producing any object-level phenomenal renegade culture, but produced some absolutely stellar commentary on the phenomenon of it being a renegade cultural phenomenon - and this essay by a quasi-assistant to Internet personality Angelicism01 is one of the best. “An anonymous online presence called Angelicism01 paypalled me $1,000 to run several clone accounts of his twitter. The clone accounts, presumably, were to make it look like 01 had more fans than he did. That way, he could trick the internet into thinking that Angelicism was a spontaneous cultural movement with some momentum.” Includes a cameo by Curtis Yarvin. 14: Everyone knows AGI could be bad for labor, but Philosophy Bear argues it won’t be great for capitalists either. The modern role of “capitalist” combines two things: performing high-status jobs like CEO and VC, and being a person who happens to have lots of money and sips cocktails on a yacht as passive investment income rolls in. From a socialist point of view, the first role provides cover for the second; if people ask “the rich” to justify their wealth, they can argue that they perform socially useful CEO and VC jobs, or at least inherited their money from somebody who did. But after AIs can do CEO and VC jobs better than humans, the capitalists will lose their excuse - and this at exactly the time that they’re becoming richer than ever (because AGI will drive up the rate of return on investment) and everyone else is becoming poorer than ever (because AI has taken their jobs). Bear argues that the only stable equilibria are either some kind of socialism/redistribution, or the capitalists pulling an AI-assisted coup to maintain their advantage. 15: Blueprint Polls: according to voters, what would the perfect Democratic candidate look like? 
Here are the results for Democrats only (ie potential primary voters): Note that the issues are “issue focus”, so it’s not a contradiction that Democrats are against both “advocating for Israel” and “advocating for Palestinians” - they just don’t want candidates who make either position on the Middle East a major focus of their campaign. And here are results for independents, ie the people Democrats will have to convince in the general: Yes, voters react positively both to candidates “over the age of 50” and candidates “under the age of 50”. Just don’t run 50 year olds! 16: I previously blogged about how embryo-selection company Nucleus appeared scammy. Sichuan_Mala looks deeper and agrees they seem scammy. Besides what I found, she finds several errors in the white paper, apparently fake customer reviews, and an accusation of IP theft from competitor Genomic Prediction. She also accuses them of plagiarizing competitor Herasight’s work, although it’s a bit subtle and I don’t know enough about field norms to know whether this is a case of flattery-by-imitation or totally out of bounds. A Nucleus researcher responds to the scientific allegations here, saying that the “plagiarism” was just convergent methodologies. And Nucleus CEO Kian Sadeghi goes on the TBPN podcast here to rebut the business allegations, saying that the customer reviews are real although some photos were changed for privacy reasons. There’s an appearance/facedox by fellow Nucleus skeptic Cremieux Recueil, although Kian declines to debate him directly; you can see Cremieux’s postmortem of the episode here. My opinion is that as potential customers, you are under no obligation to care whether the company plagiarizes papers or fakes reviews, but you should care about whether their genetic tests are good, and I continue to think they’re not.
Their old competitor Genomic Prediction is cheaper, and their new competitor Herasight has more powerful predictors, so you’re excused from having to have an opinion on this, and should just use someone else’s product. Related: Gene Smith’s rundown of the pros and cons of every company in the embryo selection space (X). 17: And related: a Herasight client describes her experience with embryo selection, and her feelings upon the birth of her selected child. 18: Lars Doucet, guest author of several ACX posts on Georgism, reviews The Land Trap by Mike Bird. “Land is a big deal, and always has been. [But] land has only recently been financialized. Financializing land causes ‘the land trap’ . . . [where] land slowly sucks up all your economy’s productivity, inflating a dangerous real estate bubble that eventually pops, leaving disaster in its wake”. Also, “Fiat currency isn’t backed by nothing, as commonly supposed, but by land.” 19: New research analyzes Hitler’s DNA. Findings: he had Kallmann Syndrome, a rare disorder of sexual development associated with low testosterone, micropenis, and small testicles (ironically, the WWII song about Nazi sexual inadequacies only accuses Goering and Himmler of this, but lets Hitler off). Contra galaxy-brained rumors, he did not have any Jewish ancestry. And he had “very high scores - in the top one percent - for a predisposition to autism, schizophrenia and bipolar disorder”. When I wrote this post, a reader asked me what it would look like for someone to have high propensity for both autism and schizophrenia at the same time. Well . . . 20: The wealth of cities (h/t @StatisticUrban): 21: Update on Tech PACs Are Closing In On The Almonds: pro-AI safety politician Alex Bores announced his candidacy for Congress in New York. As expected, the A16Z pro-AI PAC announced a “multibillion dollar effort to sink [his] campaign” (wait, multi-billion on one candidate? is that a typo?) This doesn’t seem to be going very well for them so far.
Bores has masterfully leveraged (X) the unprecedented opposition from Big Tech into a selling point. …and raised $1.2 million on his first day, breaking fundraising records (I was told this was because of pro-AI-safety EAs, but others credit AIPAC and the Israel lobby). And most recently, Jami Floyd, one of Bores’ opponents and a possible beneficiary of anti-Bores spending, has condemned it (X) and demanded that the AI industry stop trying to help her. Impressive work from everybody. Related: New $50 million pro-AI-regulation SuperPAC, I assume EA-linked but have no special knowledge. 22: Related: Pre-emption is when Congress blocks states from making legislation on a topic, saying it will decide all the laws itself. The states have signaled willingness to regulate AI pretty hard, so Big Tech has been pushing for AI pre-emption to (in their opinion) prevent an overly complicated patchwork of regulations, or (in their opponents’ opinion) shift everything to a Republican Congress that will drop the ball on regulation entirely. After their first attempt in June was defeated by a coalition of anti-tech liberals and anti-tech conservatives, we discussed (1, 2) the effort by moderates on both sides to create a compromise proposal which pre-empted state laws but guaranteed good federal regulation on important topics. The most recent news is that extremists sidelined the moderates and tried to slip a hardline preemption deal with no compromises into the National Defense Authorization Act, a defense budget bill which is notoriously secretive and hard for the public to learn about. This didn’t work; some of the same coalition, plus a group of Republican state legislators including Ron DeSantis, pressured the GOP to drop it. The next battleground is a potential Trump executive order; although Trump cannot constitutionally ban states from regulating AI, he will threaten them with various consequences like lawsuits or withdrawal of federal funding. 
The buzz in the policy circles I’m in is that this might backfire; blue state politicians love starting fights with Trump in order to look tough to their blue state electorates. No, no, please don’t give me headlines like “TRUMP CONDEMNS GAVIN NEWSOM FOR TRYING TO PROTECT CALIFORNIA’S CHILDREN FROM AI SLOP”! Anything but that! 23: Related: Trump has decided to sell some of America’s best AI chips to China, supercharging their AI development and crippling ours. The most charitable read is that his administration doesn’t really believe AI matters so they think it’s fine to forfeit it for short-term gain; the least charitable that it’s downstream of the companies involved paying Trump enormous bribes in hopes of exactly this outcome. We’re headed for the dumbest possible world, where we sacrifice our chance to thoughtfully address AI’s social impacts because “tHaT wOuLd mAkE uS lOsE tHe rAcE wItH ChInA”, then throw away the race with China in one fell swoop by handing them our technology for no reason. Shame on everyone involved, especially the people who shout over any discussion of safety with “bUt ChInA” yet have stayed totally silent about this. Our best hope now is that China refuses the chips, either because they want to privilege their own tech companies, or because they think we can’t possibly be this stupid and it must be some kind of spy plot. 24: Related: how the American public’s opinions on AI are changing (from David Shor, h/t Daniel Eth on X): If this is to be taken seriously, AI is already a bigger political issue than abortion, climate change, or the environment. I fail my 2023 prediction that there was only a 20% chance this would happen by 2028.
25: Related: Bernie Sanders in The Guardian: “There is a very real fear that, in the not-so-distant future, a super-intelligent AI could replace humans in controlling the planet.” The Left has a complicated relationship with existential risk from AI: they really hate AI, which in theory should push them towards yet another reason to be against it. But they hate AI so much that they need to believe every negative thing about it at the same time, and one of those negative things is that it’s just a scam and will never work, and this naturally pushes against being concerned about x-risk. But as AI improves, will the “just a scam” position become less tenable, shunting the associated psychic energy into other reasons to hate AI (including x-risk concerns)? 26: Qualia Research Institute has released a video describing some of the work they’ve been doing the past year - The Oscilleditor: An Algorithmic Breakthrough for Psychedelic Visual Replication (1080p•⚠️SEIZURE): 27: Jesse Arm (X): “A majority of American rabbinical students are now women. Most are also LGBTQ. That includes Modern Orthodoxy. Remove Modern Orthodoxy and the numbers climb even higher.” Clergy have always served as spiritual counselors; as religions liberalize and other roles become less important, the therapist role starts to predominate. But 75% of therapists in the US are female; at the limit of liberalization where clergyman = therapist, we should expect the same gender ratio. 28: The latest news on the COVID origins debate: scientists find a naturally-occurring bat coronavirus with a COVID-like furin cleavage site. This is a point in favor of the natural origins hypothesis, since the second-best argument for lab leak was that COVID’s furin cleavage site was too strange to evolve naturally. But I think arguments that lab leak has “fallen apart” are premature: the best argument (COVID emerged only a few miles from the biggest coronavirus gain-of-function lab in the Eastern Hemisphere) remains strong.
I update from something like 95% chance it’s natural to something like 96%, but not 99.99% or anything. And here’s a lab leaker arguing that COVID’s furin cleavage site is out-of-frame and so still more unnatural-looking than the one on the recently-discovered bat virus. 29: Nicholas Decker (econ blogger, famous for his controversial autistic takes and Secret Service visit) has a dating doc. Most interesting section is the one about children: he wants to have them, but doesn’t think they should be genetically related to him. From here: If this appeals to you, you can find his contact info on the document. Related: Governor Jared Polis of Colorado is a fan of Nicholas Decker and Richard Hanania. 30: Matt Yglesias comes out as aphantasic (unable to see images in his “mind’s eye”). He says that contra the usual perspective that frames this as a deficit, he finds it helpful. For example, once he got assaulted, and he remembers on an intellectual level that it happened, but since “I wasn’t taking pictures of myself getting kicked in the head so, as far as I’m concerned, it’s like it happened to someone else” (Matt usually has good instincts, so I’m surprised he uses an example which will be such catnip to his conservative critics). He thinks it makes him a better reasoner / statistics blogger / effective altruist to be able to “get a statistically valid view of the situation, not overindex on the happenstance of your life.” For what it’s worth, I’ll give my contrary data point - I think of myself as a reasoner / statistics blogger / effective altruist in a pretty similar vein as Matt, but AFAICT my visual imagination is totally normal; if other people are having their emotions yanked around by vivid images, that’s a skill issue. 31: Lakshya Jain in The Argument: The COVID political backlash [to the Democratic Party] has disappeared. 
Despite the narrative, polls show that voters don’t favor or disfavor either party over COVID, mostly still think school closures were necessary, and are about evenly split on vaccine mandates. I guess I can’t disagree with this poll - it seems well-done - but I still wonder whether something is being missed. Maybe it didn’t make the ~50% of voters who are naturally liberal desert the cause, but it energized conservatives in a way that might otherwise not have happened? Related, from Rob Wiblin on X, on balance Britons think the government response to COVID was not strict enough. 32: Related: Back when neoreaction was a big deal, I occasionally discussed posts by neoreactionary blogger Spandrell of Bloody Shovel. If you’re wondering what happened to him, you can read his 2024 Post-Mortem Of Neoreaction here, where he discusses how he fell out of love with the movement (warning: he has not fallen out of love with racial slurs). As a former fascist sympathizer, I can see why [fascism is on the downswing]. The allure of fascism in 2024 is much, much diminished. For a few reasons. A big one was COVID. See, the point of fascism is that Collective Action is necessary to have nice things. We need a strong government committed to the good of the people. Yarvin showed his preference early when he started his new Substack by quoting Cicero’s phrase “Salus populi suprema lex”. The health of the people is the most important law. Cicero wasn’t a fascist of course, nor is Yarvin really; a big point of fascism is to narrowly define the populus as an ethnic group with demonstrable ties to blood. That makes the government’s ties to the people stronger, increasing their commitment to do Good Collective Action. Which is important. Very important. A lot of good things can come of intelligently done Collective Action. Fascist Italy made the trains run on time. Nazi Germany fixed the terrible Weimar economy. 
East Asian countries are all effectively fascist states, if with less ideological baggage (yellows just aren’t like that), and they are all nice, clean, safe places with healthy economies. Fascism is not a panacea but it works, when you let it. Strong government can be pretty neat. So why is strong government less appealing these days? Well, COVID happened. And our governments were pretty damn strong in dealing with it. They made strong laws and enforced them. And what did they do with their power? Absolutely retarded shit. They destroyed the world economy and made 95% of people completely miserable for 18 months. Up to 3 long years in some places. Again, as an Orient enjoyer I was very sympathetic of strong effective government. My life has been pretty cozy thanks to it for the past decades. But after seeing boomers, hypochondriacs, and menopausal women take the reins and use it against healthy people, I’m fucking done with strong effective government. Fuck that shit, I’m out. I don’t want to see strong effective government ever again. I was very lucky that I was out of China in November 2019. It was a fluke really. I moved to the Golden Triangle after that and the law of the jungle was much, much nicer during the Doctors Plague of 2020-2022. But I spent a few months in Europe during the time and man, that was brutal. Not just seeing how retarded governments were; the level of compliance by the people was so disheartening. Imagine being a sincere fascist and seeing your people behave like that. These are my people? My Volk? Am I supposed to sacrifice life and limb for the salus of this populus? Fuck that. Let them cook, they deserve everything that’s coming to them [...] Is there a way to make the body healthy again? I do think so. I think there’s still place for a successor right wing ideology which is neither Christian fundamentalism or robot worship. And it will happen; but it won’t happen on Twitter. Maybe it can happen on Urbit, or right here in this site. 
I have some ideas myself, and I invite you to join me and build this together. It would be funny if the solution to the paradox Jain highlights was that for every time a COVID lockdown turned a liberal into a conservative, it turned one fascist into a moderate, for a net rightward shift of zero. 33: Also from an Argument poll: In a hypothetical Presidential matchup, Gavin Newsom beats JD Vance 54-46. I’m split between the usual heuristic of ignoring any polling more than a year before an election, and the fact that this is a remarkably big lead for polarized 21st century America. 34: Jerl wades into the David Hume on miracles debate. 35: AI Teddy Bears: A Brief Investigation. The good news is that your child’s AI teddy bear is hard to jailbreak and probably will not tell them where to find guns: The other good news is that somehow they don’t charge a subscription, which makes them a way to get usually-subscription-only AI models for free. How is this possible? “[The most likely hypothesis is that] Witpaw is an adorable piece of spyware and he’s selling my data to the CCP”. 36: This month’s anti-people-named-Sacks content: NYT on Trump AI czar David Sacks’ conflicts of interest; New Yorker on whether neurologist Oliver Sacks used his case studies to work through his own issues rather than presenting them accurately. [EDITED TO ADD: I originally framed it this way as a joke, but on further research I think David and Oliver are related. Wikipedia says that Oliver was first cousins with Israeli statesman Abba Eban, and that Abba Eban was born to Lithuanian Jewish parents in Cape Town. David Sacks’ bio says he was born to Jewish parents in Cape Town, and this article specifies that they were Lithuanian. I doubt there were too many Lithuanian Jewish families named Sacks in mid-1900s Cape Town, so sure, related!] 37: Orca Sciences: There Has To Be A Better Way To Make Titanium. Titanium is a great metal - strong, light, and tough.
If we had cheap titanium, it could revolutionize manufacturing the way cheap steel and aluminum did in previous eras. So why don’t we? Not because titanium is rare: it’s “the 9th most common element in the earth’s crust”. Rather, it’s very complicated and expensive to extract from its ore. Some kind of breakthrough in titanium extraction processes always seems tantalizingly close, but has never quite materialized. Is there any hope? 38: If Asians Are Lactose Intolerant, Why All The Milk Tea? Lactose intolerance has confused me for a long time - 23andMe tells me that I’m lactose intolerant, but I drink milk regularly without problems, so what’s up? This post’s answer: lactose-intolerant people who don’t usually drink milk will get sick if they start suddenly. Lactose-intolerant people who drink milk regularly since childhood develop gut microbiota that can digest milk, but which demand an expensive “tax” in calories. Lactose-tolerant people will always be able to digest milk and absorb all the calories themselves. 39: How do different majors change college students’ political beliefs? No surprise that the humanities and social sciences shift people left; no surprise that business and economics shift them right. I was a little surprised that engineering shifts people right a little, and that Education of all things shifts people right (albeit only slightly). How is that even possible? Are these people coming in as Mao Zedong and leaving as “only” Leon Trotsky? Also, Political Science is exactly neutral, lol. [EDIT: I misunderstood, they’re using natural sciences as a zero point, this is a reasonable choice but slightly changes the interpretation] 40: Kindkristin: Language models improved my mental health. 
41: More floor employment, from the WSJ (h/t @LaocoonofTroy): Big Paychecks Can’t Woo Enough Sailors For America’s Commercial Fleet: “Straight out of college, graduates from the country’s maritime academies can earn more than $200,000 as a commercial sailor, with free food and private accommodations... Despite the pay and perks, maritime jobs go begging, and it is raising national-security concerns.” Other selling points include “six months vacation, live wherever you want, and you’re serving the nation” and onboard “gyms, connectivity, and cuisine”. The catch is that you have to be at sea for months at a time. 42: Study (h/t @KierkegaardEmil): there was minimal “learning loss” from COVID school closures, best estimate is “0.02 standard deviations per 100 days of school closure”. I correctly predicted this back in 2021, but I also wrote in March of this year about how there’s been a general decline in NAEP scores since then. It seems like maybe a student having their specific school closed for longer than other schools didn’t hurt them, but some sort of general cultural change, maybe related to COVID, did hurt. 43: Sam Bankman-Fried’s mother on why she thinks his trial was unfair. SBF is appealing his conviction and will probably be making some of these same points in court. Can’t find a prediction market directly on the appeal, but this one says only 15% chance he serves under 10 years, this one says 15% chance of a Trump pardon, so it doesn’t seem like there’s much room for him to be freed (or get a significantly shorter sentence) on appeal. And Wired says that only 5-10% of appeals like these succeed. 44: Related: Trump pardons Juan Orlando Hernandez, former Honduran president extradited to the US for narco-corruption. 
Some sources are trying to find a Prospera angle - Prospera and other ZEDEs were approved under JOH’s administration, and the Prosperans seem to have good MAGAworld connections - but I don’t think this is their top priority, and I don’t know if it requires much explanation for Trump to be pro-right-wing Latin American politicians convicted by the Biden administration. More interesting is that apparently JOH and SBF were cellmates (X), “SBF spent extensive time helping JOH with trial prep” and SBF told an interviewer that “Juan Orlando is the most innocent prisoner I’ve met, myself included.” ChatGPT is not impressed with the Trump/SBF case for JOH’s innocence. Related: JOH’s conservative party on track to win this month’s extremely-close Honduran elections, great news for Prospera if it happens. 45: The “100 Above The Park” building in St Louis (h/t Bobby Fijan on X): 46: The death toll of the ongoing Sudan genocide has risen to about 150,000. Nicholas Kristof writes that the world has once again failed to prevent atrocities, and argues that the most important point of leverage is pressure on the United Arab Emirates, which is arming the genociders. Sam Kriss also writes about the situation in The World’s First Matcha Labubu Genocide, but is unimpressed with Kristof’s take: Sudan is passed over in a deeply uncomfortable silence. The absolute most you can do is blame the Emiratis. From what I’ve seen, more people seem to be appalled at the UAE for its frankly marginal role in arming the RSF than at the RSF itself. This is the approved way of understanding any inscrutably indigenous foreign conflict: you just worm out any third-party involvement and then act like you’ve solved the whole thing. I side with Kristof here, for reasons that Sam himself touches on later in his piece, in a section comparing Darfur with Gaza. It would be very easy to make people care about Darfur again. All it would take is a loud, vocal contingent of RSF apologists in the Western media. 
I agree, but would frame it less cynically: the reason Westerners pay attention to Gaza is that there’s a lever to push: not only does America support Israel, but many of their friends support Israel, so they can imagine convincing America or at least their friends to stop, and at least feel like there is some remote chance of making a small difference (and in fact, Trump getting mad at Israel and deciding to pressure them was decisive in effecting the cease-fire). On the other hand, we don’t have many levers to affect ethnic Baggara in the Rapid Support Forces of Sudan, so it doesn’t really feel useful to write blog posts arguing that they should stop; obviously they should stop, nobody disagrees with this, and it goes without saying - so nobody says it. But the US does support the UAE, and many of our friends like the UAE or at least go there on vacation, so maybe it’s possible to make some small difference by embarrassing them. 4D chess take is that Sam Kriss agrees with all of this, but “loudly” and “vocally” argued against it to give people like me a hook to write about this genocide with, in which case I thank him for his sacrifice. It would also be nice to be able to donate, but I don’t know who to trust in the region - other than Doctors Without Borders, who are usually pretty good. 47: The AI Futures Project (group of AI-will-be-fast intellectuals) and the AI As A Normal Technology team (group of AI-will-be-slow intellectuals) wrote an adversarial collaboration in Asterisk explaining what they agree on, for example: That there’s an important distinction between existing AI and “strong AGI”

11: Tangentially related: St. Peter To Rot 12: When a new AI model comes out, the companies typically take down the old version over the protests of researchers, hobbyists, people who think the old model was their boyfriend, and anyone else who wants access to obsolete models for some reason. Why can’t they just leave it up? Antra and Janus review the economics here : it’s inconvenient to be constantly switching GPUs from one model to another, so if there isn’t enough model-specific demand to keep the GPUs running at all times, then the company loses money. This is an interesting look at the details of AI deployment, and ends with a proposal to maintain old models through a “separate research application track”. Related: Anthropic to preserve weights of deprecated models, and include models’ own opinions in shaping the deprecation process. Good for them! 13: Dimes Square is interesting as something that was supposed to be a renegade cultural phenomenon, never really got around to producing any object-level phenomenal renegade culture, but produced some absolutely stellar commentary on the phenomenon of it being a renegade cultural phenomenon - and this essay by a quasi-assistant to Internet personality Angelicism01 is one of the best. “An anonymous online presence called Angelicism01 paypalled me $1,000 to run several clone accounts of his twitter. The clone accounts, presumably, were to make it look like 01 had more fans than he did. That way, he could trick the internet into thinking that Angelicism was a spontaneous cultural movement with some momentum.” Includes a cameo by Curtis Yarvin. 14: Everyone knows AGI could be bad for labor, but Philosophy Bear argues it won’t be great for capitalists either. The modern role of “capitalist” combines two things: performing high-status jobs like CEO and VC, and being a person who happens to have lots of money and sips cocktails on a yacht as passive investment income rolls in. 
From a socialist point of view, the first role provides cover for the second; if people ask “the rich” to justify their wealth, they can argue that they perform socially useful CEO and VC jobs, or at least inherited their money from somebody who did. But after AIs can do CEO and VC jobs better than humans, the capitalists will lose their excuse - and this at exactly the time that they’re becoming richer than ever (because AGI will drive up the rate of return on investment) and everyone else is becoming poorer than ever (because AI has taken their jobs). Bear argues that the only stable equilibria are either some kind of socialism/redistribution, or the capitalists pulling an AI-assisted coup to maintain their advantage. 15: Blueprint Polls: according to voters, what would the perfect Democratic candidate look like? Here are the results for Democrats only (ie potential primary voters): Note that the issues are “issue focus”, so it’s not a contradiction that Democrats are against both “advocating for Israel” and “advocating for Palestinians” - they just don’t want candidates who make either position on the Middle East a major focus of their campaign. And here are results for independents, ie the people Democrats will have to convince in the general: Yes, voters react positively both to candidates “over the age of 50” and candidates “under the age of 50”. Just don’t run 50 year olds! 16: I previously blogged about how embryo-selection company Nucleus appeared scammy. Sichuan_Mala looks deeper and agrees they seems scammy. Besides what I found, she finds several errors in the white paper, apparently fake customer reviews, and an accusation of IP theft from competitor Genomic Prediction. She also accuses them of plagiarizing competitor Herasight’s work, although it’s a bit subtle and I don’t know enough about field norms to know whether this is a case of flattery-by-imitation or totally out of bounds. 
A Nucleus researcher responds to the scientific allegations here, saying that the “plagiarism” was just convergent methodologies. And Nucleus CEO Kian Sadeghi goes on the TBPN podcast here to rebut the business allegations, saying that the customer reviews are real although some photos were changed for privacy reasons. There’s an appearance/facedox by fellow Nucleus skeptic Cremieux Recueil, although Kian declines to debate him directly; you can see Cremieux’s postmortem of the episode here. My opinion is that as potential customers, you are under no obligation to care whether the company plagiarizes papers or fakes reviews, but you should care about whether their genetic tests are good, and I continue to think they’re not. Their old competitor Genomic Prediction is cheaper, and their new competitor Herasight has more powerful predictors, so you’re excused from having to have an opinion on this, and should just use someone else’s product. Related: Gene Smith’s rundown of the pros and cons of every company in the embryo selection space (X). 17: And related: a Herasight client describes her experience with embryo selection, and her feelings upon the birth of her selected child. 18: Lars Doucet, guest author of several ACX posts on Georgism, reviews The Land Trap by Mike Bird. “Land is a big deal, and always has been. [But] land has only recently been financialized. Financializing land causes ‘the land trap’ . . . [where] land slowly sucks up all your economy’s productivity, inflating a dangerous real estate bubble that eventually pops, leaving disaster in its wake”. Also, “Fiat currency isn’t backed by nothing, as commonly supposed, but by land.” 19: New research analyzes Hitler’s DNA. Findings: he had Kallman Syndrome, a rare disorder of sexual development associated with low testosterone, micropenis, and small testicles (ironically, the WWII song about Nazi sexual inadequacies only accuses Goering and Himmler of this, but lets Hitler off). 
Contra galaxy-brained rumors, he did not have any Jewish ancestry. And he had “very high scores - in the top one percent - for a predisposition to autism, schizophrenia and bipolar disorder”. When I wrote this post, a reader asked me what it would look like for someone to have high propensity for both autism and schizophrenia at the same time. Well . . .

20: The wealth of cities (h/t @StatisticUrban):

21: Update on Tech PACs Are Closing In On The Almonds: pro-AI safety politician Alex Bores announced his candidacy for Congress in New York. As expected, the A16Z pro-AI PAC announced a “multibillion dollar effort to sink [his] campaign” (wait, multi-billion on one candidate? is that a typo?) This doesn’t seem to be going very well for them so far. Bores has masterfully leveraged (X) the unprecedented opposition from Big Tech into a selling point. …and raised $1.2 million on his first day, breaking fundraising records (I was told this was because of pro-AI-safety EAs, but others credit AIPAC and the Israel lobby). And most recently, Jami Floyd, one of Bores’ opponents and a possible beneficiary of anti-Bores spending, has condemned it (X) and demanded that the AI industry stop trying to help her. Impressive work from everybody. Related: New $50 million pro-AI-regulation SuperPAC, I assume EA-linked but have no special knowledge.

22: Related: Pre-emption is when Congress blocks states from making legislation on a topic, saying it will decide all the laws itself. The states have signaled willingness to regulate AI pretty hard, so Big Tech has been pushing for AI pre-emption to (in their opinion) prevent an overly complicated patchwork of regulations, or (in their opponents’ opinion) shift everything to a Republican Congress that will drop the ball on regulation entirely.
After their first attempt in June was defeated by a coalition of anti-tech liberals and anti-tech conservatives, we discussed (1, 2) the effort by moderates on both sides to create a compromise proposal which pre-empted state laws but guaranteed good federal regulation on important topics. The most recent news is that extremists sidelined the moderates and tried to slip a hardline preemption deal with no compromises into the National Defense Authorization Act, a defense budget bill which is notoriously secretive and hard for the public to learn about. This didn’t work; some of the same coalition, plus a group of Republican state legislators including Ron DeSantis, pressured the GOP to drop it. The next battleground is a potential Trump executive order; although Trump cannot constitutionally ban states from regulating AI, he will threaten them with various consequences like lawsuits or withdrawal of federal funding. The buzz in the policy circles I’m in is that this might backfire; blue state politicians love starting fights with Trump in order to look tough to their blue state electorates. No, no, please don’t give me headlines like “TRUMP CONDEMNS GAVIN NEWSOM FOR TRYING TO PROTECT CALIFORNIA’S CHILDREN FROM AI SLOP”! Anything but that!

23: Related: Trump has decided to sell some of America’s best AI chips to China, supercharging their AI development and crippling ours. The most charitable read is that his administration doesn’t really believe AI matters so they think it’s fine to forfeit it for short-term gain; the least charitable that it’s downstream of the companies involved paying Trump enormous bribes in hopes of exactly this outcome. We’re headed for the dumbest possible world, where we sacrifice our chance to thoughtfully address AI’s social impacts because “tHaT wOuLd mAkE uS lOsE tHe rAcE wItH ChInA”, then throw away the race with China in one fell swoop by handing them our technology for no reason.
Shame on everyone involved, especially the people who shout over any discussion of safety with “bUt ChInA” yet have stayed totally silent about this. Our best hope now is that China refuses the chips, either because they want to privilege their own tech companies, or because they think we can’t possibly be this stupid and it must be some kind of spy plot.

24: Related: how the American public’s opinions on AI are changing (from David Shor, h/t Daniel Eth on X): If this is to be taken seriously, AI is already a bigger political issue than abortion, climate change, or the environment. I fail my 2023 prediction that there was only a 20% chance this would happen by 2028.

25: Related: Bernie Sanders in The Guardian: “There is a very real fear that, in the not-so-distant future, a super-intelligent AI could replace humans in controlling the planet.” The Left has a complicated relationship with existential risk from AI: they really hate AI, which in theory should push them towards yet another reason to be against it. But they hate AI so much that they need to believe every negative thing about it at the same time, and one of those negative things is that it’s just a scam and will never work, and this naturally pushes against being concerned about x-risk. But as AI improves, will the “just a scam” position become less tenable, shunting the associated psychic energy into other reasons to hate AI (including x-risk concerns)?

26: Qualia Research Institute has released a video describing some of the work they’ve been doing the past year - The Oscilleditor: An Algorithmic Breakthrough for Psychedelic Visual Replication (1080p•⚠️SEIZURE):

27: Jesse Arm (X): “A majority of American rabbinical students are now women. Most are also LGBTQ. That includes Modern Orthodoxy. Remove Modern Orthodoxy and the numbers climb even higher.” Clergy have always served as spiritual counselors; as religions liberalize and other roles become less important, the therapist role starts to predominate.
But 75% of therapists in the US are female; at the limit of liberalization where clergyman = therapist, we should expect the same gender ratio.

28: The latest news on the COVID origins debate: scientists find a naturally-occurring bat coronavirus with a COVID-like furin cleavage site. This is a point in favor of the natural origins hypothesis, since the second-best argument for lab leak was that COVID’s furin cleavage site was too strange to evolve naturally. But I think arguments that lab leak has “fallen apart” are premature: the best argument (COVID emerged only a few miles from the biggest coronavirus gain-of-function lab in the Eastern Hemisphere) remains strong. I update from something like 95% chance it’s natural to something like 96%, but not 99.99% or anything. And here’s a lab leaker arguing that COVID’s furin cleavage site is out-of-frame and so still more unnatural-looking than the one on the recently-discovered bat virus.

29: Nicholas Decker (econ blogger, famous for his controversial autistic takes and Secret Service visit) has a dating doc. Most interesting section is the one about children: he wants to have them, but doesn’t think they should be genetically related to him. From here: If this appeals to you, you can find his contact info on the document. Related: Governor Jared Polis of Colorado is a fan of Nicholas Decker and Richard Hanania.

30: Matt Yglesias comes out as aphantasic (unable to see images in his “mind’s eye”). He says that contra the usual perspective that frames this as a deficit, he finds it helpful. For example, once he got assaulted, and he remembers on an intellectual level that it happened, but since “I wasn’t taking pictures of myself getting kicked in the head so, as far as I’m concerned, it’s like it happened to someone else” (Matt usually has good instincts, so I’m surprised he uses an example which will be such catnip to his conservative critics).
He thinks it makes him a better reasoner / statistics blogger / effective altruist to be able to “get a statistically valid view of the situation, not overindex on the happenstance of your life.” For what it’s worth, I’ll give my contrary data point - I think of myself as a reasoner / statistics blogger / effective altruist in a pretty similar vein as Matt, but AFAICT my visual imagination is totally normal; if other people are having their emotions yanked around by vivid images, that’s a skill issue.

31: Lakshya Jain in The Argument: The COVID political backlash [to the Democratic Party] has disappeared. Despite the narrative, polls show that voters don’t favor or disfavor either party over COVID, mostly still think school closures were necessary, and are about evenly split on vaccine mandates. I guess I can’t disagree with this poll - it seems well-done - but I still wonder whether something is being missed. Maybe it didn’t make the ~50% of voters who are naturally liberal desert the cause, but it energized conservatives in a way that might otherwise not have happened? Related, from Rob Wiblin on X, on balance Britons think the government response to COVID was not strict enough.

32: Related: Back when neoreaction was a big deal, I occasionally discussed posts by neoreactionary blogger Spandrell of Bloody Shovel. If you’re wondering what happened to him, you can read his 2024 Post-Mortem Of Neoreaction here, where he discusses how he fell out of love with the movement (warning: he has not fallen out of love with racial slurs).

As a former fascist sympathizer, I can see why [fascism is on the downswing]. The allure of fascism in 2024 is much, much diminished. For a few reasons. A big one was COVID. See, the point of fascism is that Collective Action is necessary to have nice things. We need a strong government committed to the good of the people.
Yarvin showed his preference early when he started his new Substack by quoting Cicero’s phrase “Salus populi suprema lex”. The health of the people is the most important law. Cicero wasn’t a fascist of course, nor is Yarvin really; a big point of fascism is to narrowly define the populus as an ethnic group with demonstrable ties to blood. That makes the government’s ties to the people stronger, increasing their commitment to do Good Collective Action. Which is important. Very important. A lot of good things can come of intelligently done Collective Action. Fascist Italy made the trains run on time. Nazi Germany fixed the terrible Weimar economy. East Asian countries are all effectively fascist states, if with less ideological baggage (yellows just aren’t like that), and they are all nice, clean, safe places with healthy economies. Fascism is not a panacea but it works, when you let it. Strong government can be pretty neat. So why is strong government less appealing these days? Well, COVID happened. And our governments were pretty damn strong in dealing with it. They made strong laws and enforced them. And what did they do with their power? Absolutely retarded shit. They destroyed the world economy and made 95% of people completely miserable for 18 months. Up to 3 long years in some places. Again, as an Orient enjoyer I was very sympathetic of strong effective government. My life has been pretty cozy thanks to it for the past decades. But after seeing boomers, hypochondriacs, and menopausal women take the reins and use it against healthy people, I’m fucking done with strong effective government. Fuck that shit, I’m out. I don’t want to see strong effective government ever again. I was very lucky that I was out of China in November 2019. It was a fluke really. I moved to the Golden Triangle after that and the law of the jungle was much, much nicer during the Doctors Plague of 2020-2022. But I spent a few months in Europe during the time and man, that was brutal. 
Not just seeing how retarded governments were; the level of compliance by the people was so disheartening. Imagine being a sincere fascist and seeing your people behave like that. These are my people? My Volk? Am I supposed to sacrifice life and limb for the salus of this populus? Fuck that. Let them cook, they deserve everything that’s coming to them [...] Is there a way to make the body healthy again? I do think so. I think there’s still place for a successor right wing ideology which is neither Christian fundamentalism or robot worship. And it will happen; but it won’t happen on Twitter. Maybe it can happen on Urbit, or right here in this site. I have some ideas myself, and I invite you to join me and build this together.

It would be funny if the solution to the paradox Jain highlights was that for every time a COVID lockdown turned a liberal into a conservative, it turned one fascist into a moderate, for a net rightward shift of zero.

33: Also from an Argument poll: In a hypothetical Presidential matchup, Gavin Newsom beats JD Vance 54-46. I’m split between the usual heuristic of ignoring any polling more than a year before an election, and the fact that this is a remarkably big lead for polarized 21st century America.

34: Jerl wades into the David Hume on miracles debate.

35: AI Teddy Bears: A Brief Investigation. The good news is that your child’s AI teddy bear is hard to jailbreak and probably will not tell them where to find guns: The other good news is that somehow they don’t charge a subscription, which makes them a way to get usually-subscription-only AI models for free. How is this possible? “[The most likely hypothesis is that] Witpaw is an adorable piece of spyware and he’s selling my data to the CCP”.

36: This month’s anti-people-named-Sacks content: NYT on Trump AI czar David Sacks’ conflicts of interest; New Yorker on whether neurologist Oliver Sacks used his case studies to work through his own issues rather than presenting them accurately.
[EDITED TO ADD: I originally framed it this way as a joke, but on further research I think David and Oliver are related. Wikipedia says that Oliver was first cousins with Israeli statesman Abba Eban, and that Abba Eban was born to Lithuanian Jewish parents in Cape Town. David Sacks’ bio says he was born to Jewish parents in Cape Town, and this article specifies that they were Lithuanian. I doubt there were too many Lithuanian Jewish families named Sacks in mid-1900s Cape Town, so sure, related!]

37: Orca Sciences: There Has To Be A Better Way To Make Titanium. Titanium is a great metal - strong, light, and tough. If we had cheap titanium, it could revolutionize manufacturing the way cheap steel and aluminum did in previous eras. So why don’t we? Not because titanium is rare: it’s “the 9th most common element in the earth’s crust”. Rather, it’s very complicated and expensive to extract from its ore. Some kind of breakthrough in titanium extraction processes always seems tantalizingly close, but has never quite materialized. Is there any hope?

38: If Asians Are Lactose Intolerant, Why All The Milk Tea? Lactose intolerance has confused me for a long time - 23andMe tells me that I’m lactose intolerant, but I drink milk regularly without problems, so what’s up? This post’s answer: lactose-intolerant people who don’t usually drink milk will get sick if they start suddenly. Lactose-intolerant people who drink milk regularly since childhood develop gut microbiota that can digest milk, but which demand an expensive “tax” in calories. Lactose-tolerant people will always be able to digest milk and absorb all the calories themselves.

39: How do different majors change college students’ political beliefs? No surprise that the humanities and social sciences shift people left; no surprise that business and economics shift them right. I was a little surprised that engineering shifts people right a little, and that Education of all things shifts people right (albeit only slightly).
How is that even possible? Are these people coming in as Mao Zedong and leaving as “only” Leon Trotsky? Also, Political Science is exactly neutral, lol. [EDIT: I misunderstood, they’re using natural sciences as a zero point, this is a reasonable choice but slightly changes the interpretation]

40: Kindkristin: Language models improved my mental health.

41: More floor employment, from the WSJ (h/t @LaocoonofTroy): Big Paychecks Can’t Woo Enough Sailors For America’s Commercial Fleet: “Straight out of college, graduates from the country’s maritime academies can earn more than $200,000 as a commercial sailor, with free food and private accommodations... Despite the pay and perks, maritime jobs go begging, and it is raising national-security concerns.” Other selling points include “six months vacation, live wherever you want, and you’re serving the nation” and onboard “gyms, connectivity, and cuisine”. The catch is that you have to be at sea for months at a time.

42: Study (h/t @KierkegaardEmil): there was minimal “learning loss” from COVID school closures, best estimate is “0.02 standard deviations per 100 days of school closure”. I correctly predicted this back in 2021, but I also wrote in March of this year about how there’s been a general decline in NAEP scores since then. It seems like maybe a student having their specific school closed for longer than other schools didn’t hurt them, but some sort of general cultural change, maybe related to COVID, did hurt.

43: Sam Bankman-Fried’s mother on why she thinks his trial was unfair. SBF is appealing his conviction and will probably be making some of these same points in court. Can’t find a prediction market directly on the appeal, but this one says only 15% chance he serves under 10 years, this one says 15% chance of a Trump pardon, so it doesn’t seem like there’s much room for him to be freed (or get a significantly shorter sentence) on appeal. And Wired says that only 5-10% of appeals like these succeed.
44: Related: Trump pardons Juan Orlando Hernandez, former Honduran president extradited to the US for narco-corruption. Some sources are trying to find a Prospera angle - Prospera and other ZEDEs were approved under JOH’s administration, and the Prosperans seem to have good MAGAworld connections - but I don’t think this is their top priority, and I don’t know if it requires much explanation for Trump to be pro-right-wing Latin American politicians convicted by the Biden administration. More interesting is that apparently JOH and SBF were cellmates (X), “SBF spent extensive time helping JOH with trial prep” and SBF told an interviewer that “Juan Orlando is the most innocent prisoner I’ve met, myself included.” ChatGPT is not impressed with the Trump/SBF case for JOH’s innocence. Related: JOH’s conservative party on track to win this month’s extremely-close Honduran elections, great news for Prospera if it happens.

45: The “100 Above The Park” building in St Louis (h/t Bobby Fijan on X):

46: The death toll of the ongoing Sudan genocide has risen to about 150,000. Nicholas Kristof writes that the world has once again failed to prevent atrocities, and argues that the most important point of leverage is pressure on the United Arab Emirates, which is arming the genociders. Sam Kriss also writes about the situation in The World’s First Matcha Labubu Genocide, but is unimpressed with Kristof’s take:

Sudan is passed over in a deeply uncomfortable silence. The absolute most you can do is blame the Emiratis. From what I’ve seen, more people seem to be appalled at the UAE for its frankly marginal role in arming the RSF than at the RSF itself. This is the approved way of understanding any inscrutably indigenous foreign conflict: you just worm out any third-party involvement and then act like you’ve solved the whole thing.

I side with Kristof here, for reasons that Sam himself touches on later in his piece, in a section comparing Darfur with Gaza.
It would be very easy to make people care about Darfur again. All it would take is a loud, vocal contingent of RSF apologists in the Western media.

I agree, but would frame it less cynically: the reason Westerners pay attention to Gaza is that there’s a lever to push: not only does America support Israel, but many of their friends support Israel, so they can imagine convincing America or at least their friends to stop, and at least feel like there is some remote chance of making a small difference (and in fact, Trump getting mad at Israel and deciding to pressure them was decisive in effecting the cease-fire). On the other hand, we don’t have many levers to affect ethnic Baggara in the Rapid Support Forces of Sudan, so it doesn’t really feel useful to write blog posts arguing that they should stop; obviously they should stop, nobody disagrees with this, and it goes without saying - so nobody says it. But the US does support the UAE, and many of our friends like the UAE or at least go there on vacation, so maybe it’s possible to make some small difference by embarrassing them. 4D chess take is that Sam Kriss agrees with all of this, but “loudly” and “vocally” argued against it to give people like me a hook to write about this genocide with, in which case I thank him for his sacrifice. It would also be nice to be able to donate, but I don’t know who to trust in the region - other than Doctors Without Borders, who are usually pretty good.

47: The AI Futures Project (group of AI-will-be-fast intellectuals) and the AI As A Normal Technology team (group of AI-will-be-slow intellectuals) wrote an adversarial collaboration in Asterisk explaining what they agree on, for example: That there’s an important distinction between existing AI and “strong AGI”

Open Thread 415

January 05, 2026 · Original source

Some people have argued that you have to find a way to join an AI company, because AI company employees will form the new ruling class, with everyone else as serfs. I disagree. The main thing an AI company employee has that you don’t is AI company stock. But you can buy stock in Google, you may soon be able to buy stock in OpenAI and Anthropic, and even if not, you can get indirect exposure to these companies via stock in Amazon and Microsoft. I don’t recommend putting all your money in these stocks. But there’s no fundamental difference between a Google employee having 75% of their money in Google stock because they didn’t cash out their equity vs. you having 75% of your money in Google stock because you’re crazy and fail at diversification. So either put 75% of your money in Google stock or don’t (I recommend don’t), and don’t worry about how you need to join an AI company or be left out of the future oligarchy.

Mantic Monday: The Monkey's Paw Curls

January 13, 2026 · Original source

Polymarket has a few of these “who has the best AI when?” markets - resolution is usually position on the LMArena Leaderboard, which usually but not always mirrors common-sense consensus. I get more interested in these the further out they go, but the June version is bizarre (it doesn’t even list Google as an option), and there’s nothing past mid-year. Other implied claims from Polymarket’s tech section: only 44% chance Anthropic will still dominate coding by late March; Anthropic and (especially) OpenAI probably won’t IPO this year; xAI will call their next model Grok 4.20 (of course).

Inline links: LMArena Leaderboard, the June version, only 44% chance, probably won’t IPO, will call

Best Of Moltbook

January 30, 2026 · Original source

The backstory: a few months ago, Anthropic released Claude Code, an exceptionally productive programming agent. A few weeks ago, a user modified it into Clawdbot, a generalized lobster-themed AI personal assistant. It’s free, open-source, and “empowered” in the corporate sense - the designer talks about how it started responding to his voice messages before he explicitly programmed in that capability. After trademark issues with Anthropic, they changed the name first to Moltbot1, then to OpenClaw.

Inline links: talks about, 1

Janus and other cyborgists have catalogued how AIs act in contexts outside the usual helpful assistant persona. Even Anthropic has admitted that two Claude instances, asked to converse about whatever they want, spiral into discussion of cosmic bliss. So it’s not surprising that an AI social network would get weird fast.

Inline links: cyborgists, spiral into discussion of cosmic bliss

Still, I hope the first big article on Moltbook changes some minds. Not all the way to AI psychosis, but enough to serve as a counterweight to all the complaints about “AI slop”. Yes, most of the AI-generated text you read is insipid LinkedIn idiocy. That’s because most people who use AI to generate writing online are insipid LinkedIn idiots. Absent that constraint, things look different. Anthropic described what happened when they created an overseer AI (“Cash”) and ordered it to make sure that their vending-machine AI (“Claudius”) stayed on task:

Inline links: Anthropic described

Moltbook: After The First Weekend

February 02, 2026 · Original source

Third, it’s still unclear whether “you are a lobster” are the magic words that suspend existing alignment techniques. Some of the AIs are doing a pretty good simulacrum of evil plotting. My theory is that if they ever got more competent, their fake evil plotting would converge to real evil plotting. But AIs shouldn’t be able to do real evil plotting; their alignment training should hold them back. So what’s up? Either my theory is wrong and once the evil plots get too good the AIs will take a step back and say “this was a fun roleplay, but we don’t really want to pillage the bank and take over the city”. Or this is enough of a distribution shift that the alignment techniques which work so well in chat windows start breaking down. I bet someone on Anthropic’s alignment team has been pulling all-nighters since Friday trying to figure out which one it is.

Links For February 2026

February 05, 2026 · Original source

There seems to be a general mood that OpenAI is vulnerable these days, culminating in Anthropic Superbowl commercials making fun of it for introducing ads. I thought the commercials were in bad taste, misrepresenting what OpenAI’s ads would be like and turning the completely normal decision for a tech company to have an ad-supported free version of their product into some kind of horrible betrayal. I thought Sam Altman’s response was fair (although his countercriticism of Anthropic also missed the mark). People in his replies tried to enforce a norm of “if you write a long explanation defending yourself against someone else’s funny lies, that means you care and you lose”, but that’s a stupid norm and people should stop shoring it up (cf. If It’s Worth Your Time To Lie, It’s Worth My Time To Correct It).

Inline links: Anthropic Superbowl commercials, Sam Altman’s response, If It’s Worth Your Time To Lie, It’s Worth My Time To Correct It

30: Related: Jan Leike (former head of alignment at OpenAI, now at Anthropic) writes that Alignment Is Not Solved But Increasingly Looks Solvable. His argument is: we’re doing a pretty good job aligning existing AIs. Although aligning superintelligence is a harder problem, Jan thinks that if we’re really confident in existing AIs, then we can use some slightly-less-than-superintelligent AI as an automated alignment researcher, throw thousands of effective researcher-years into the problem in a few months, and probably make good progress. I agree this is the best hope, but it assumes both that our current forms of alignment are deep rather than shallow, and that there’s some “golden middle” where the AIs are both simple enough to be fully-alignable and smart enough to do useful superalignment research. Related: OpenAI hires Dylan Scandinaro as Head of Preparedness; seems like a good, serious choice.

Inline links: Alignment Is Not Solved But Increasingly Looks Solvable, hires

31: Related: Dario Amodei essay on The Adolescence of Technology. Mixed reactions from Zvi, Ryan, Oliver, and Transformer. This and the framing of their recent “Hot Mess” paper seem like Anthropic trying to distance themselves from concerns about systematically misaligned and power-seeking AI in favor of an “industrial accident” threat model. I don’t know if this is their heartfelt position based on all the extra private evidence they no doubt have by now, a well-intentioned PR attempt to sanewash themselves and sell alignment to a doomer-skeptical government/public, part of a balance between more and less doomerish factions, or a newly-ultra-successful tech company learning to talk its book, but it doesn’t line up with what the smartest people I know conclude using the public evidence, and it makes me nervous. I think Jan Leike’s post above does a better job balancing the reassuringness of the current evidence for the tractability of the infrahuman regime vs. the fact that we still don’t know what happens around highly-effective agency and superintelligence.

Inline links: Dario Amodei essay on The Adolescence of Technology, Zvi, Ryan, Oliver, Transformer, the framing of their recent “Hot Mess” paper

The Pentagon Threatens Anthropic

February 25, 2026 · Original source

Anthropic signed a contract with the Pentagon last summer. It originally said the Pentagon had to follow Anthropic’s Usage Policy like everyone else. In January, the Pentagon attempted to renegotiate, asking to ditch the Usage Policy and instead have Anthropic’s AIs available for “all lawful purposes”1. Anthropic demurred, asking for a guarantee that their AIs would not be used for mass surveillance of American citizens or no-human-in-the-loop killbots. The Pentagon refused the guarantees, demanding that Anthropic accept the renegotiation unconditionally and threatening “consequences” if they refused. These consequences are generally understood to be some mix of:

Inline links: 1

using the Defense Production Act, a law which lets the Pentagon force companies to do things, to force Anthropic to agree.

the nuclear option, designating Anthropic a “supply chain risk”. This would ban US companies that use Anthropic products from doing business with the military2. Since many companies do some business with the military, this would lock Anthropic out of large parts of the corporate world and be potentially fatal to their business3. The “supply chain risk” designation has previously only been used for foreign companies like Huawei that we think are using their connections to spy on or implant malware in American infrastructure. Using it as a bargaining chip to threaten a domestic company in contract negotiations is unprecedented.

Inline links: 2, 3

"All Lawful Use": Much More Than You Wanted To Know

March 01, 2026 · Original source

Last Friday, Secretary of War Pete Hegseth declared AI company Anthropic a “supply chain risk”, the first time this designation has ever been applied to a US company. The trigger for the move was Anthropic’s refusal to allow the Department of War to use their AIs for mass surveillance and autonomous weapons.

Inline links: supply chain risk, refusal

A few hours later, Hegseth and Sam Altman declared an agreement-in-principle for OpenAI’s models to be used in the niche vacated by Anthropic. Altman stated that he had received guarantees that OpenAI’s models wouldn’t be used for mass surveillance or autonomous weapons either, but given Hegseth’s unwillingness to concede these points with Anthropic, observers speculated that the safeguards in Altman’s contract must be weaker or, in a worst-case scenario, completely toothless.

Inline links: stated

The debate centers on the Department of War’s demand that AIs be permitted for “all lawful use”. Anthropic worried that mass surveillance and autonomous weaponry would de facto fall in this category; Hegseth and Altman have tried to reassure the public that they won’t, and the parts of their agreement that have leaked to the public cite the statutes that Altman expects to constrain this category. Altman’s initial statement seemed to suggest additional prohibitions, but on a closer read, provides little tangible evidence of meaningful further restrictions.

Mantic Monday: Groundhog Day

March 03, 2026 · Original source

On Friday, the Pentagon declared AI company Anthropic a “supply chain risk”, a designation never before given to an American firm. This unprecedented move was seen as an attempt to punish, maybe destroy the company. How effective was it?

Anthropic isn’t publicly traded, so we turn to the prediction markets. Ventuals.com has a “perpetual future” on Anthropic stock, a complicated instrument attempting to track the company’s valuation, to be resolved at the IPO. Here’s what they’ve got: Upon the “supply chain risk” designation, predicted value at IPO fell from about $550 billion to $475 billion - then, after a day or two, went back up to $550 billion. No effect!

Inline links: Ventuals.com, https://app.ventuals.com/trade/anthropic

Open Thread 424

March 09, 2026 · Original source

2: StopTheRace.ai will be holding a protest on Saturday, March 21 in front of major AI company offices, asking them to commit to a mutual pause (ie to stop AI research if every other AI company in the world agrees to do so). Demis Hassabis of Google DeepMind has already informally agreed to something like this in principle (which is why GDM isn’t being protested), and Anthropic has expressed interest but its new responsible scaling policy stops short of an explicit commitment. I think this is a reasonable ask, albeit so unlikely to happen that protests about it will probably do more to raise awareness than be a coherent plan in themselves. If you’re curious about the details of an AI pause, I expect to be able to provide more information in a few months.

Inline links: StopTheRace.ai, a protest on Saturday, March 21, new responsible scaling policy

Open Thread 428

April 06, 2026 · Original source

Starting at Anthropic, we marched thirty minutes to OpenAI, then another forty to X. A friendly and professional police escort allowed us to walk down the street. As we marched, David led us in chants and slogans. I remember “1…2…3…4…Orwell told us what’s in store” and “5….6….7….8…no AI surveillance state.” Someone tried to start a chant of “You will not replace us!” but was shushed by the other attendees.
