SolidGoldMagikarp

Article

SolidGoldMagikarp is a recurring term in the Astral Codex Ten archive, appearing 2 times across 2 issues between February 09, 2023 and February 20, 2023. Although the archive files it under Instagram Accounts, the source passages describe it as a string that GPT treats as an anomalous token. The archive places it in contexts such as “including the string ‘SolidGoldMagikarp’” and “not SolidGoldMagikarp style”. It most often appears alongside California, COVID, and GPT-2.

Metadata

Category: Instagram Accounts
Mention count: 2
Issue count: 2
First seen: February 09, 2023
Last seen: February 20, 2023

Appears In

- California (2 shared issues)
- COVID (2 shared issues)
- GPT-2 (2 shared issues)
- Iran (2 shared issues)
- Israel (2 shared issues)
- Italy (2 shared issues)
- US (2 shared issues)
- YouTube (2 shared issues)
- 2020 election (1 shared issue)
- 2020 primary (1 shared issue)
- 23andme (1 shared issue)
- @moritheil (1 shared issue)

External Links

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

Links For February 2023

February 09, 2023 · Original source

(source) 22: Related: the very center of GPT’s embedding space contains a few unusual tokens including the string “SolidGoldMagikarp”. GPT displays anomalous behavior if these tokens are inserted in a query; for example, it treats “SolidGoldMagikarp” as the word “distribute”. ChatGPT is pretty advanced and fails semi-gracefully here; GPT-2’s reaction to these tokens is more disturbing: (source: Less Wrong) Further investigation determined that many of these tokens are the screen names of a group of Redditors who attempted to count to infinity. The most likely explanation, according to the discoverers, is that these names were in GPT’s tokenization data, but not its training data (maybe they were especially common in the tokenization data because they made thousands of posts with numbers in them, but didn’t make it into the training data because their posts had no content?) - that leaves them existing without content, and GPT tries to round them off to some other “nearby” token (by incomprehensible AI standards of nearbyness). Congrats to the SERI-MATS AI alignment researchers who found all of this; maybe this makes it 0.0001% less likely that the AI which controls the nuclear arsenal in twenty years will have equally inexplicable behavior. 23: More language model news: LLM that understands and can explain images

Inline links: source, contains a few unusual tokens, https://substackcdn.com/image/fetch/$s_!rkvW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdeb0cd39-6942-44c0-85ef-0fac91b362e3_536x147.png, https://substackcdn.com/image/fetch/$s_!QIzo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba1d0783-5826-48a3-9ead-1e803042e96d_622x631.png, Less Wrong, many of these tokens are the screen names of a group of Redditors who attempted to count to infinity, SERI-MATS, LLM that understands and can explain images
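
The passage above says the unusual tokens sit at "the very center of GPT's embedding space". As a rough illustration of what that means, here is a minimal Python sketch that ranks tokens by distance from the centroid of an embedding table. It does not load any real GPT weights: the table is a made-up toy, the token "hypothetical_rare_token" and all function names are invented for this example, and the idea that barely trained tokens keep small, near-initial vectors is an assumption of the sketch rather than something the source states.

```python
import math
import random


def centroid(vectors):
    """Return the coordinate-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    count = len(vectors)
    return [sum(vec[i] for vec in vectors) / count for i in range(dim)]


def nearest_to_centroid(embeddings, k):
    """Return the k token strings whose vectors lie closest to the centroid."""
    center = centroid(list(embeddings.values()))
    ranked = sorted(embeddings, key=lambda tok: math.dist(embeddings[tok], center))
    return ranked[:k]


def toy_embeddings(seed=0, dim=64):
    """Build a toy embedding table (NOT real GPT data).

    Assumption for illustration only: tokens seen during training end up with
    spread-out vectors, while tokens that were never trained keep tiny vectors.
    """
    rng = random.Random(seed)
    trained = ["the", "cat", "distribute", "number", "image", "model"]
    untrained = ["SolidGoldMagikarp", "hypothetical_rare_token"]
    table = {}
    for tok in trained:
        table[tok] = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for tok in untrained:
        table[tok] = [rng.gauss(0.0, 0.01) for _ in range(dim)]
    return table


if __name__ == "__main__":
    table = toy_embeddings()
    # List the two tokens that sit nearest the center of the toy space.
    print(nearest_to_centroid(table, 2))
```

In this toy setup the two untrained tokens rank closest to the centroid, which mirrors the kind of search the passage describes; how the discoverers actually ran their analysis is not specified in the recovered text.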

Grading My 2018 Predictions For 2023

February 20, 2023 · Original source

Gary Marcus can still figure out at least three semi-normal (ie not SolidGoldMagikarp style) situations where the most advanced language AIs make ridiculous errors that a human teenager wouldn’t make, more than half the time they’re asked the questions: 30%
