SolidGoldMagikarp

Article

SolidGoldMagikarp is a recurring entry in the Astral Codex Ten archive, appearing twice across two issues between February 09, 2023 and February 20, 2023. The archive files it under Instagram Accounts, although the source passages below discuss it as an anomalous token string in GPT's vocabulary. The archive places it in contexts such as “including the string “SolidGoldMagikarp”” and “not SolidGoldMagikarp style”. It most often appears alongside California, COVID, and GPT-2.

Metadata

  • Category: Instagram Accounts
  • Mention count: 2
  • Issue count: 2
  • First seen: February 09, 2023
  • Last seen: February 20, 2023

Appears In

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

February 09, 2023 · Original source
22: Related: the very center of GPT’s embedding space contains a few unusual tokens including the string “SolidGoldMagikarp”. GPT displays anomalous behavior if these tokens are inserted in a query; for example, it treats “SolidGoldMagikarp” as the word “distribute”. ChatGPT is pretty advanced and fails semi-gracefully here; GPT-2’s reaction to these tokens is more disturbing: (source: Less Wrong) Further investigation determined that many of these tokens are the screen names of a group of Redditors who attempted to count to infinity. The most likely explanation, according to the discoverers, is that these names were in GPT’s tokenization data, but not its training data (maybe they were especially common in the tokenization data because they made thousands of posts with numbers in them, but didn’t make it into the training data because their posts had no content?) - that leaves them existing without content, and GPT tries to round them off to some other “nearby” token (by incomprehensible AI standards of nearbyness). Congrats to the SERI-MATS AI alignment researchers who found all of this; maybe this makes it 0.0001% less likely that the AI which controls the nuclear arsenal in twenty years will have equally inexplicable behavior.
23: More language model news: LLM that understands and can explain images
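The passage above describes tokens sitting at "the very center" of the embedding space because they appeared in the tokenization data but not in the training data. A minimal sketch of that idea follows, using synthetic vectors rather than GPT's actual embeddings. The function name `tokens_nearest_centroid`, the toy vocabulary, and the two placeholder token names are assumptions made for illustration; this is not the method the discoverers used.

```python
import numpy as np


def tokens_nearest_centroid(embeddings, vocab, k=3):
    """Return the k tokens whose embedding vectors lie closest to the mean embedding."""
    centroid = embeddings.mean(axis=0)
    distances = np.linalg.norm(embeddings - centroid, axis=1)
    nearest = np.argsort(distances)[:k]
    return [vocab[i] for i in nearest]


# Synthetic stand-in for an embedding table (assumption: NOT real GPT weights).
rng = np.random.default_rng(0)
dim = 16

# "Trained" tokens: vectors spread widely, as if training had moved them apart.
trained_vocab = [f"word{i}" for i in range(50)]
trained = rng.normal(0.0, 1.0, size=(len(trained_vocab), dim))

# "Untrained" tokens: present in the vocabulary but never updated, modelled here
# as tiny perturbations around the mean of the trained vectors.
# Only "SolidGoldMagikarp" comes from the source; the other names are hypothetical.
untrained_vocab = ["SolidGoldMagikarp", "placeholderTokenA", "placeholderTokenB"]
untrained = trained.mean(axis=0) + rng.normal(0.0, 0.01, size=(len(untrained_vocab), dim))

vocab = trained_vocab + untrained_vocab
embeddings = np.vstack([trained, untrained])

central_tokens = tokens_nearest_centroid(embeddings, vocab, k=3)
print(central_tokens)
```

In this toy setup the three "untrained" tokens come out as the nearest neighbours of the centroid, which mirrors the qualitative claim in the passage; how real models initialise and update rarely seen tokens may differ.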
February 20, 2023 · Original source
Gary Marcus can still figure out at least three semi-normal (ie not SolidGoldMagikarp style) situations where the most advanced language AIs make ridiculous errors that a human teenager wouldn’t make, more than half the time they’re asked the questions: 30%