SolidGoldMagikarp
Article
SolidGoldMagikarp is a string that recurs in the Astral Codex Ten archive, appearing 2 times across 2 issues between February 09, 2023 and February 20, 2023. The archive files it under Instagram Accounts, though the source passages describe it as an anomalous GPT token rather than an account. It appears in contexts such as “including the string ‘SolidGoldMagikarp’” and “not SolidGoldMagikarp style”. It most often appears alongside California, COVID, and GPT-2.
Metadata
- Category: Instagram Accounts
- Mention count: 2
- Issue count: 2
- First seen: February 09, 2023
- Last seen: February 20, 2023
Related Pages
- California (2 shared issues)
- COVID (2 shared issues)
- GPT-2 (2 shared issues)
- Iran (2 shared issues)
- Israel (2 shared issues)
- Italy (2 shared issues)
- US (2 shared issues)
- YouTube (2 shared issues)
- 2020 election (1 shared issue)
- 2020 primary (1 shared issue)
- 23andme (1 shared issue)
- @moritheil (1 shared issue)
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
(source) 22: Related: the very center of GPT’s embedding space contains a few unusual tokens including the string “SolidGoldMagikarp”. GPT displays anomalous behavior if these tokens are inserted in a query; for example, it treats “SolidGoldMagikarp” as the word “distribute”. ChatGPT is pretty advanced and fails semi-gracefully here; GPT-2’s reaction to these tokens is more disturbing: (source: Less Wrong)
Further investigation determined that many of these tokens are the screen names of a group of Redditors who attempted to count to infinity. The most likely explanation, according to the discoverers, is that these names were in GPT’s tokenization data, but not its training data (maybe they were especially common in the tokenization data because they made thousands of posts with numbers in them, but didn’t make it into the training data because their posts had no content?) - that leaves them existing without content, and GPT tries to round them off to some other “nearby” token (by incomprehensible AI standards of nearbyness). Congrats to the SERI-MATS AI alignment researchers who found all of this; maybe this makes it 0.0001% less likely that the AI which controls the nuclear arsenal in twenty years will have equally inexplicable behavior.
23: More language model news: LLM that understands and can explain images
Inline links: source, contains a few unusual tokens, https://substackcdn.com/image/fetch/$s_!rkvW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdeb0cd39-6942-44c0-85ef-0fac91b362e3_536x147.png, https://substackcdn.com/image/fetch/$s_!QIzo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba1d0783-5826-48a3-9ead-1e803042e96d_622x631.png, Less Wrong, many of these tokens are the screen names of a group of Redditors who attempted to count to infinity, SERI-MATS, LLM that understands and can explain images
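The passage above describes tokens sitting at "the very center" of the embedding space. As a rough illustration of what such a search could look like, the sketch below ranks tokens by their distance from the mean embedding. It is not the discoverers' actual method: the function names and the 2-D vectors are invented for this example, and placing the anomalous token near the origin is an assumption made to mimic a token that "exists without content".

```python
import math

def centroid(vectors):
    """Return the component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def closest_to_centroid(embeddings, k=3):
    """Rank tokens by Euclidean distance from the mean embedding, nearest first."""
    center = centroid(list(embeddings.values()))
    ranked = sorted(embeddings, key=lambda tok: math.dist(embeddings[tok], center))
    return ranked[:k]

# Toy 2-D embeddings, invented for illustration only (not real GPT weights).
toy_embeddings = {
    " the": [2.0, 0.5],
    " cat": [-2.0, 1.0],
    " run": [0.5, -2.0],
    " blue": [-0.5, 0.5],
    # Assumption of this sketch: a token with almost no learned content
    # ends up very close to the middle of the space.
    " SolidGoldMagikarp": [0.01, -0.02],
}

nearest = closest_to_centroid(toy_embeddings, k=1)
print(nearest)
```

With real model weights the same ranking would run over tens of thousands of high-dimensional vectors, but the idea is unchanged: tokens that sit unusually close to the centroid are candidates for closer inspection.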
Gary Marcus can still figure out at least three semi-normal (ie not SolidGoldMagikarp style) situations where the most advanced language AIs make ridiculous errors that a human teenager wouldn’t make, more than half the time they’re asked the questions: 30%