polysemanticity

Article

Polysemanticity is a recurring concept in the Astral Codex Ten archive, appearing 2 times across 2 issues between November 27, 2023 and March 07, 2025. The archive places it in contexts such as “The authors described this as ‘polysemanticity’ - multiple meanings for one neuron” and “The best real answer I can come up with is polysemanticity and superposition”. It most often appears alongside An Introduction To Circuits, Anthropic, and Anthropic interpretability team.

Metadata

  • Category: Concepts
  • Mention count: 2
  • Issue count: 2
  • First seen: November 27, 2023
  • Last seen: March 07, 2025

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

November 27, 2023 · Original source
Second, this doesn’t work. When you switch to a weaker AI with “only” a few hundred neurons and build special tools to automate the stimulus/analysis process, the neurons aren’t this simple. A few low-level ones respond to basic features (like curves in an image). But deep in the middle, where the real thought has to be happening, there’s nothing representing “dog”. Instead, the neurons are much weirder than this. In one image model, an earlier paper found “one neuron that responds to cat faces, fronts of cars, and cat legs”. The authors described this as “polysemanticity” - multiple meanings for one neuron.
March 07, 2025 · Original source
The best real answer I can come up with is polysemanticity and superposition. Everyone has more concepts they want stored than neurons to store them, so they cram multiple concepts into the same neuron through a complicated algorithm that involves some loss of . . . fidelity? Usability? Precision? If you have too few neurons, the neurons have to become massively polysemantic, and it becomes harder to do anything in particular with them.
more neurons in the brain -> there are more possible configurations of firing -> it's a "richer" language. one consequence of this is that you need less polysemanticity, as you said.
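
The passages above describe cramming more concepts into a network than it has neurons, at some cost in fidelity. A minimal sketch of that idea follows. It is not taken from the archive or from any cited paper: the two-neuron layout, the evenly spaced feature directions, the ReLU readout, and the function names feature_directions, encode, and decode are all assumptions chosen for illustration.

```python
import math

N_NEURONS = 2  # assumption: a toy layer with only two neurons


def feature_directions(n_features):
    """Assign each feature a unit vector, spread evenly around a circle.

    With n_features > N_NEURONS the directions cannot all be orthogonal,
    so every neuron (coordinate) ends up carrying several features.
    """
    return [
        (
            math.cos(2 * math.pi * k / n_features),
            math.sin(2 * math.pi * k / n_features),
        )
        for k in range(n_features)
    ]


def encode(features, directions):
    """Neuron activations: the sum of each feature value times its direction."""
    return tuple(
        sum(value * direction[i] for value, direction in zip(features, directions))
        for i in range(N_NEURONS)
    )


def decode(activations, directions):
    """Read each feature back by projecting onto its direction, then apply ReLU."""
    return [
        max(0.0, sum(a * x for a, x in zip(activations, direction)))
        for direction in directions
    ]


# Three concepts stored in two neurons (illustrative values only).
directions = feature_directions(3)

# One concept active: the ReLU removes the negative interference.
one_active = decode(encode([1, 0, 0], directions), directions)

# Two concepts active at once: they interfere with each other.
two_active = decode(encode([1, 1, 0], directions), directions)

print("one active:", one_active)
print("two active:", two_active)
```

In this toy setup a single active concept is read back at full strength, while two simultaneously active concepts each come back at roughly half strength. That is one concrete way to picture the "loss of fidelity" the March 07, 2025 passage mentions. The first neuron has a nonzero weight for all three concepts, which is the sense in which it is polysemantic; raising N_NEURONS to match the number of concepts would allow orthogonal directions and remove the interference.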