polysemanticity
Article
Polysemanticity is a recurring concept in the Astral Codex Ten archive, appearing 2 times across 2 issues between November 27, 2023 and March 07, 2025. The archive places it in contexts such as “The authors described this as ‘polysemanticity’ - multiple meanings for one neuron” and “The best real answer I can come up with is polysemanticity and superposition”. It most often appears alongside An Introduction To Circuits, Anthropic, and the Anthropic interpretability team.
Metadata
- Category: Concepts
- Mention count: 2
- Issue count: 2
- First seen: November 27, 2023
- Last seen: March 07, 2025
Appears In
- God Help Us, Let’s Try To Understand AI Monosemanticity
- Why Should Intelligence Be Related To Neuron Count?
Related Pages
- An Introduction To Circuits (1 shared issue)
- Anthropic (1 shared issue)
- Anthropic interpretability team (1 shared issue)
- biscornu (1 shared issue)
- Book 14 (1 shared issue)
- Claude (1 shared issue)
- Cunningham et al. (1 shared issue)
- Distribution Representations: Composition & Superposition (1 shared issue)
- God From The Machine (1 shared issue)
- Godzilla (1 shared issue)
- GPT (1 shared issue)
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
Second, this doesn’t work. When you switch to a weaker AI with “only” a few hundred neurons and build special tools to automate the stimulus/analysis process, the neurons aren’t this simple. A few low-level ones respond to basic features (like curves in an image). But deep in the middle, where the real thought has to be happening, there’s nothing representing “dog”. Instead, the neurons are much weirder than this. In one image model, an earlier paper found “one neuron that responds to cat faces, fronts of cars, and cat legs”. The authors described this as “polysemanticity” - multiple meanings for one neuron.
Inline links: earlier paper
The best real answer I can come up with is polysemanticity and superposition. Everyone has more concepts they want stored than neurons to store them, so they cram multiple concepts into the same neuron through a complicated algorithm that involves some loss of . . . fidelity? Usability? Precision? If you have too few neurons, the neurons have to become massively polysemantic, and it becomes harder to do anything in particular with them.
Inline links: polysemanticity and superposition
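The trade-off described in the passage above (more concepts than neurons, stored with some loss of precision) can be illustrated with a toy sketch. This is not the method used in the papers the issue links to; it simply assumes each concept is assigned a random unit direction in "neuron space", and all function names here are hypothetical. It measures how much concepts overlap with one another as the number of neurons changes.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction in `dim`-dimensional neuron space."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def make_feature_directions(n_features, n_neurons, seed=0):
    """Assumption: every concept gets its own random unit direction."""
    rng = random.Random(seed)
    return [random_unit_vector(n_neurons, rng) for _ in range(n_features)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def encode(active_features, directions):
    """Neuron activations are the sum of the active concepts' directions."""
    activations = [0.0] * len(directions[0])
    for f in active_features:
        for i, weight in enumerate(directions[f]):
            activations[i] += weight
    return activations


def decode(activations, directions):
    """Read each concept back out by projecting onto its direction."""
    return [dot(activations, d) for d in directions]


def mean_squared_interference(directions):
    """Average squared overlap between distinct concepts (0 = no crosstalk)."""
    total, pairs = 0.0, 0
    for i in range(len(directions)):
        for j in range(i + 1, len(directions)):
            total += dot(directions[i], directions[j]) ** 2
            pairs += 1
    return total / pairs


# Store 50 concepts in progressively larger layers and compare the crosstalk.
for n_neurons in (8, 64, 512):
    dirs = make_feature_directions(50, n_neurons)
    print(n_neurons, "neurons ->", mean_squared_interference(dirs))
```

In this sketch every neuron carries a nonzero weight for every concept, so each neuron is "polysemantic" by construction; what changes with layer size is how much the concepts interfere when read back out, which is one way to picture the loss of fidelity the passage describes.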
more neurons in the brain -> there are more possible configurations of firing -> it's a "richer" language. one consequence of this is that you need less polysemanticity, as you said.