GPT-3

Article

GPT-3 is a recurring brand in the Astral Codex Ten archive, mentioned 5 times across 5 issues between June 07, 2022 and June 20, 2023. Representative contexts include “Of these six prompts that GPT-3 original failed, GPT-3 advanced gets four unambiguously right”; “Now it is true that GPT-3 is genuinely better than GPT-2”; and “GPT-3 didn’t hurt anyone”. It most often appears alongside GPT-4, OpenAI, and DALL-E.

Metadata

  • Category: Brands
  • Mention count: 5
  • Issue count: 5
  • First seen: June 07, 2022
  • Last seen: June 20, 2023

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

June 07, 2022 · Original source
Of these six prompts that GPT-3 original failed, GPT-3 advanced gets four unambiguously right. I give it half-credit for the lawyer prompt; it continued the direction that the story was obviously leaning, understood it was a bad idea, and I would have given it full credit except that it suggested it might sort of be excusable if you were really lucky.
Thanks to OpenAI for giving me access to some of their online tools (by the way, Marcus says they refuse to let him access them and he has to access it through friends, which boggles me). I was able to plug Marcus’ same queries into the latest OpenAI language model (an advanced version of GPT-3). In each case, I used the exact same language, but also checked it with a conceptually similar example to make sure OpenAI didn’t cheat by adding Marcus’ particular example in by hand (they didn’t). Some answers truncated for length:
Of the nine prompts GPT-2 failed, GPT-3 gets between five and seven right, depending on how strict you want to be.
June 10, 2022 · Original source
Now it is true that GPT-3 is genuinely better than GPT-2, and maybe (but maybe not, see footnote 1) true that InstructGPT is genuinely better than GPT-3. I do think that for any given example, the probability of a correct answer has gone up. [Scott] is quite right about that, at least for GPT-2 to GPT-3.
GPT-3 has ~100 billion parameters. It did significantly better than GPT-2, but still failed on some different questions Marcus was able to find.
That is: suppose we created some ideal Platonic benchmark of every reasoning problem you might ask a human. Suppose GPT-2 got 20% of these right, and GPT-3 gets 40% of these right. Might some future GPT-X - not necessarily 4, but 5, or 10, or whatever - get 100% right? I don’t see how Marcus can rule this out: he can’t point to any specific kind of reasoning problem GPTs will never be able to solve. And he agrees that each generation of GPTs can solve more than the one before. So why shouldn’t GPT keep progressing until it gets 100%?
March 01, 2023 · Original source
Sam Altman posing with leading AI safety proponent Eliezer Yudkowsky. Also Grimes for some reason.
Planning For AGI And Beyond (“AGI” = “artificial general intelligence”, ie human-level AI) is the latest volley in that campaign. It’s very good, in all the ways ExxonMobil’s hypothetical statement above was very good. If they’re trying to fool people, they’re doing a convincing job! Still, it doesn’t apologize for doing normal AI company stuff in the past, or plan to stop doing normal AI company stuff in the present. It just says that, at some indefinite point when they decide AI is a threat, they’re going to do everything right. This is more believable when OpenAI says it than when ExxonMobil does. There are real arguments for why an AI company might want to switch from moving fast and breaking things at time t to acting all responsible at time t + 1. Let’s explore the arguments they make in the document, go over the reasons they’re obviously wrong, then look at the more complicated arguments they might be based off of.
Why Doomers Think OpenAI Is Bad And Should Have Slowed Research A Long Time Ago
OpenAI boosters might object: there’s a disanalogy between the global warming story above and AI capabilities research. Global warming is continuously bad: a temperature increase of 0.5 degrees C is bad, 1.0 degrees is worse, and 1.5 degrees is worse still. AI doesn’t become dangerous until some specific point. GPT-3 didn’t hurt anyone. GPT-4 probably won’t hurt anyone. So why not keep building fun chatbots like these for now, then start worrying later?
Doomers counterargue that the fun chatbots burn timeline. That is, suppose you have some timeline for when AI becomes dangerous. For example, last year Metaculus thought human-like AI would arrive in 2040, and superintelligence around 2043. Recent AIs have tried lying to, blackmailing, threatening, and seducing users. AI companies freely admit they can’t really control their AIs, and it seems high-priority to solve that before we get superintelligence. If you think that’s 2043, the people who work on this question (“alignment researchers”) have twenty years to learn to control AI.
Then OpenAI poured money into AI, did ground-breaking research, and advanced the state of the art. That meant that AI progress would speed up, and AI would reach the danger level faster. Now Metaculus expects superintelligence in 2031, not 2043 (although this seems kind of like an over-update), which gives alignment researchers eight years, not twenty. So the faster companies advance AI research - even by creating fun chatbots that aren’t dangerous themselves - the harder it is for alignment researchers to solve their part of the problem in time.
This is why some AI doomers think of OpenAI as an Exxon-Mobil style villain, even though they’ve promised to change course before the danger period. Imagine an environmentalist group working on research and regulatory changes that would have solar power ready to go in 2045. Then ExxonMobil invents a new kind of super-oil that ensures that, nope, all major cities will be underwater by 2031 now. No matter how nice a statement they put out, you’d probably be pretty mad!
Why OpenAI Thinks Their Research Is Good Now, But Might Be Bad Later
OpenAI understands the argument against burning timeline. But they counterargue that having the AIs speeds up alignment research and all other forms of social adjustment to AI. If we want to prepare for superintelligence - whether solving the technical challenge of alignment, or solving the political challenges of unemployment, misinformation, etc - we can do this better when everything is happening gradually and we’ve got concrete AIs to think about:
We believe we have to continuously learn and adapt by deploying less powerful versions of the technology in order to minimize “one shot to get it right” scenarios […] As we create successively more powerful systems, we want to deploy them and gain experience with operating them in the real world. We believe this is the best way to carefully steward AGI into existence—a gradual transition to a world with AGI is better than a sudden one. We expect powerful AI to make the rate of progress in the world much faster, and we think it’s better to adjust to this incrementally. A gradual transition gives people, policymakers, and institutions time to understand what’s happening, personally experience the benefits and downsides of these systems, adapt our economy, and to put regulation in place. It also allows for society and AI to co-evolve, and for people collectively to figure out what they want while the stakes are relatively low.
You might notice that, as written, this argument doesn’t support full-speed-ahead AI research. If you really wanted this kind of gradual release that lets society adjust to less powerful AI, you would do something like this: Release AI #1
March 14, 2023 · Original source
5: Will takeoff be slow vs. fast? So far we’ve had brisk but still gradual progress in AI; GPT-3 is better than GPT-2, and GPT-4 will probably be better still. Every few years we get a new model which is better than previous models by some predictable amount.
June 20, 2023 · Original source
GPT-4 is better than GPT-3, but maybe not the same amount of better that an AI that did 100% of human jobs would have to be over an AI that did 20% of human jobs. That suggests the gap is bigger than the 2 OOMs that separate GPT-4 from GPT-3.
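Editorial note on "OOMs": the term abbreviates "orders of magnitude", so a gap of 2 OOMs is a factor of 10^2 = 100. The excerpt does not say which quantity the gap is measured in, so the sketch below only illustrates the unit conversion. The function names are hypothetical and do not come from the source.

```python
import math


def ooms_to_factor(ooms: float) -> float:
    """Convert a gap in base-10 orders of magnitude into a multiplicative factor."""
    return 10.0 ** ooms


def factor_to_ooms(factor: float) -> float:
    """Convert a multiplicative factor back into base-10 orders of magnitude."""
    return math.log10(factor)


# The passage's "2 OOMs" between GPT-3 and GPT-4, read as a 100x gap.
gpt3_to_gpt4_factor = ooms_to_factor(2)
print(gpt3_to_gpt4_factor)  # 100.0
```

On this reading, the passage's claim is that the gap between an AI doing 20% of human jobs and one doing 100% would correspond to a factor larger than 100.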