Biological Anchors: A Trick That Might Or Might Not Work
Introduction I’ve been trying to review and summarize Eliezer Yudkowksy’s recent dialogues on AI safety. Previously in sequence: Yudkowsky Contra Ngo On Agents . Now we’re up to Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure out what Cotra’s talking about and what’s going on.
Metadata
- Published: February 23, 2022
- Source: https://www.astralcodexten.com/p/biological-anchors-a-trick-that-might
- Document ID:
2022-02-23_biological-anchors-a-trick-that-might_full
Category Map
Concepts
- AGI (25 mentions)
- GPT-3 (19 mentions)
- Moore’s Law (6 mentions)
- Go (5 mentions)
- AlphaGo (4 mentions)
- Deep Blue (4 mentions)
- Bio Anchors (3 mentions)
- evolution (3 mentions)
- GOFAI (2 mentions)
- Moore’s Law (2 mentions)
- Nature (2 mentions)
- Starcraft (2 mentions)
- AIXI (1 mentions)
- Ajeya’s Evolutionary Anchor (1 mentions)
- alpha-beta pruning (1 mentions)
- biological anchors (1 mentions)
- CNN (1 mentions)
- convnets (1 mentions)
- Deep learning (1 mentions)
- deep-learning revolution (1 mentions)
- Dennard scaling (1 mentions)
- Eliza-style chatbots (1 mentions)
- FLOPs (1 mentions)
- GPGPU computing (1 mentions)
- GPT3 (1 mentions)
- Japanese canopy plant (1 mentions)
- LSTM (1 mentions)
- Neural Net With Long Horizon (1 mentions)
- Neural networks (1 mentions)
- Platt’s Law (1 mentions)
- Platt’s Law (1 mentions)
- Platt’s Law Of AI Forecasting (1 mentions)
- Pythagorean Theorem (1 mentions)
- Restricted Boltzmann machines (1 mentions)
- RNN (1 mentions)
- SVMs (1 mentions)
- transformative AI (1 mentions)
- Transformer (1 mentions)
Events
- Manhattan Project (2 mentions)
- AlphaStar (1 mentions)
- Netflix Prize (1 mentions)
- SF STC (1 mentions)
- WCCC (1 mentions)
Films
- Starcraft (1 mentions)
Game
- StarCraft (1 mentions)
Organizations
- Metaculus (86 mentions)
- OpenAI (83 mentions)
- Google (82 mentions)
- DeepMind (21 mentions)
- MIRI (14 mentions)
- Open Philanthropy Project (12 mentions)
- AI Impacts (5 mentions)
- AlphaGo (5 mentions)
- GPT-4 (5 mentions)
- Open Phil (5 mentions)
- OpenPhil (5 mentions)
- GPT-3 (3 mentions)
People
- Eliezer Yudkowsky (58 mentions)
- Bryan Caplan (32 mentions)
- Eliezer (27 mentions)
- Paul Christiano (14 mentions)
- Paul (11 mentions)
- Ajeya Cotra (9 mentions)
- Katja Grace (7 mentions)
- Holden Karnofsky (6 mentions)
- Matthew Barnett (6 mentions)
- Ray Kurzweil (5 mentions)
- Ajeya (3 mentions)
- Carl (3 mentions)
- Mark Xu (3 mentions)
- Carl Shulman (2 mentions)
- Hans Moravec (2 mentions)
- Hinton (2 mentions)
- Joe Carlsmith (2 mentions)
- Vanessa Kosoy (2 mentions)
- Ajeya et al (1 mentions)
- Besiroglu (1 mentions)
- Besiroglu’s data (1 mentions)
- Charles Platt (1 mentions)
- Charlie Steiner (1 mentions)
- Grace et al. (1 mentions)
- Hernandez & Brown (1 mentions)
- Hernandez&Brown estimate (1 mentions)
- Hippke (1 mentions)
- Holden (1 mentions)
- Humbali (1 mentions)
- Moravec (1 mentions)
- Neil Thompson (1 mentions)
- OpenPhil CEO Holden Karnofsky (1 mentions)
- Rohin Shah (1 mentions)
- Salakhutdinov (1 mentions)
- Vanessa (1 mentions)
- Vernor Vinge (1 mentions)
Places
- Australia (53 mentions)
Publications
- Less Wrong (40 mentions)
- Yudkowsky contra Cotra on biological anchors (2 mentions)
- Ajeya’s report (1 mentions)
- Alignment Newsletter (1 mentions)
- Biology-Inspired AGI Timelines: The Trick That Never Works (1 mentions)
- Biology-Inspired AI Timelines: The Trick That Never Works (1 mentions)
- Colab notebook (1 mentions)
- FLOPS Alone Turn The Wheel Of History (1 mentions)
- Google spreadsheet (1 mentions)
- GPT-2 (1 mentions)
- Grace et al expert survey (1 mentions)
- Measuring The Algorithmic Efficiency Of Neural Networks (1 mentions)
- Yudkowsky Contra Ngo On Agents (1 mentions)
Software
- Fritz (1 mentions)
- Modern Stockfish (1 mentions)
- NNUE engines (1 mentions)
- SF8 (1 mentions)
Venues
- Great Eastern (1 mentions)
Full Primary Source Text
Introduction I’ve been trying to review and summarize Eliezer Yudkowksy’s recent dialogues on AI safety. Previously in sequence: Yudkowsky Contra Ngo On Agents . Now we’re up to Yudkowsky contra Cotra on biological anchors, but before we get there we need to figure out what Cotra’s talking about and what’s going on. The Open Philanthropy Project (“Open Phil”) is a big effective altruist foundation interested in funding AI safety. It’s got $20 billion, probably the majority of money in the field, so its decisions matter a lot and it’s very invested in getting things right. In 2020, it asked senior
researcher Ajeya Cotra to produce a report on when human-level AI would arrive. It says the resulting document is “informal” - but it’s 169 pages long and likely to affect millions of dollars in funding, which some might describe as making it kind of formal. The report finds a 10% chance of “transformative AI” by 2031, a 50% chance by 2052, and an almost 80% chance by 2100. Eliezer rejects their methodology and expects AI earlier (he doesn’t offer many numbers, but here he gives Bryan Caplan 50-50 odds on 2030, albeit not totally seriously ). He made the case
in his own very long essay, Biology-Inspired AGI Timelines: The Trick That Never Works , sparking a bunch of arguments and counterarguments and even more long essays. There’s a small cottage industry of summarizing the report already, eg OpenPhil CEO Holden Karnofsky’s article and Alignment Newsletter editor Rohin Shah’s comment . I’ve drawn from both for my much-inferior attempt. Part I: The Cotra Report Ajeya Cotra is a senior research analyst at OpenPhil. She’s assisted by her fiancee Paul Christiano (compsci PhD, OpenAI veteran, runs an AI alignment nonprofit) and to a lesser degree by other leading lights. Although not
everyone involved has formal ML training, if you care a lot about whether efforts are “establishment” or “contrarian”, this one is probably more establishment. The report asks when will we first get “transformative AI” (ie AI which produces a transition as impressive as the Industrial Revolution; probably this will require it to be about as smart as humans). Its methodology is: 1. Figure out how much inferential computation the human brain does. 2. Try to figure out how much training computation it would take, right now, to get a neural net that does the same amount of inferential computation. Get
some mind-bogglingly large number. 3. Adjust for “algorithmic progress”, ie maybe in the future neural nets will be better at using computational resources efficiently. Get some number which, realistically, is still mind-bogglingly large. 4. Probably if you wanted that mind-bogglingly large amount of computation, it would take some mind-bogglingly large amount of money. But computation is getting cheaper every year. Also, the economy is growing every year. Also, the share of the economy that goes to investments in AI companies is growing every year. So at some point, some AI company will actually be able to afford that mind-boggingly-large amount
of money, deploy the mind-bogglingly large amount of computation, and train the AI that has the same inferential computation as the human brain. 5. Figure out what year that is. Does this encode too many questionable assumptions? For example, might AGI come from an ecosystem of interacting projects (eg how the Industrial Revolution came from an ecosystem of interacting technologies) such that nobody has to train an entire brain-sized AI in one run? Maybe - in fact, Ajeya thinks the Industrial Revolution scenario might be more likely than the single-run scenario. But she finds the single-run scenario as a useful
upper bound (later she mentions other reasons to try it as a lower bound, and compromises by treating it as a central estimate) and still thinks it’s worth figuring out how long it will take. So let’s go through the steps one by one: How Much Computation Does The Human Brain Do? Step one - figuring out how much computation the human brain does - is a daunting task. A successful solution would look like a number of FLOP/S (floating point operations per second), a basic unit of computation in digital computers. Luckily for Ajeya and for us, another OpenPhil
analyst, Joe Carlsmith, finished a report on this a few months prior. It concluded the brain probably uses 10^13 - 10^17 FLOP/S. Why? Partly because this was the number given by most experts. But also, there are about 10^15 synapses in the brain, each one spikes about once per second, and a synaptic spike probably does about one FLOP of computation. (I’m not sure if he’s taking into account the recent research suggesting that computation sometimes happens within dendrites - see section 2.1.1.2.2 of his report for complications and why he feels okay ignoring them - but realistically there are
lots of order-of-magnitude-sized gray areas here, and he gives a sufficiently broad range that as long as the unknown unknowns aren’t all in the same direction it should be fine.) So a human-level AI would also need to do 10^15 floating point operations per second? Unclear. Computers can run on more or less efficient algorithms; neural nets might use their computation more or less effectively than the brain. You might think it would be more efficient, since human designers can do better than the blind chance of evolution. Or you might think it would be less efficient, since many biological
processes are still far beyond human technology. Or you might do what OpenPhil did and just look at a bunch of examples of evolved vs. designed systems and see which are generally better: Source: This document by Paul Christiano. Ajeya combines this with another metric where they see how existing AI compares to animals with apparently similar computational capacity; for example, she says that DeepMind’s Starcraft engine has about as much inferential compute as a honeybee and seems about equally subjectively impressive. I have no idea what this means. Impressive at what? Winning multiplayer online games? Stinging people? In any
case, they decide to penalize AI by one order of magnitude compared to Nature, so a human-level AI would need to do 10^16 floating point operations per second. How Much Compute Would It Take To Train A Model That Does 10^16 Floating Point Operations Per Second? So an AI could potentially equal the human brain with 10^16 FLOP/S. Good news! There’s a supercomputer in Japan that can do 10^17 FLOP/S! It looks like this ( source ) So why don’t we have AI yet? Why don’t we have ten AIs? In the modern paradigm of machine learning, it takes very
big computers to train relatively small end-product AIs. If you tried to train GPT-3 on the same kind of medium-sized computers you run it on, it would take between tens and hundreds of years. Instead, you train GPT-3 on giant supercomputers like the ones above, get results in a few months, then run it on medium-sized computers, maybe ~10x better than the average desktop. But our hypothetical future human-level AI is 10^16 FLOP/S in inference mode. It needs to run on a giant supercomputer like the one in the picture. Nothing we have now could even begin to train it.
There’s no direct and obvious way to convert inference requirements to training requirements. Ajeya tries assuming that each parameter will contribute about 10 FLOPs, which would mean the model would have about 10^15 parameters (GPT-3 has about 10^11 parameters). Finally, she uses some empirical scaling laws derived from looking at past machine learning projects to estimate that training 10^15 parameters would require H10^30 FLOPs, where H represents the model’s “horizon”. If I understand this correctly, “horizon” is a reinforcement learning concept: how long does it take to learn how much reward you got for something? If you’re playing a slot
machine, the answer is one second. If you’re starting a company, the answer might be ten years. So what horizon do you need for human level AI? Who knows? It probably depends on what human-level task you want the AI to do, plus how well an AI can learn to do that task from things less complex than the entire task. If writing a good book is mostly about learning to write good sentence and then stringing them together, a book-writing AI can get away with a short horizon. If nothing short of writing an entire book and then evaluating
it to see whether it is good or bad can possibly teach you book-writing, the AI will need a long time horizon. Ajeya doesn’t claim to have a great answer for this, and considers three models: horizons of a few minutes, a few hours, and a few years. Each step up adds another three orders of magnitude, so she ends up with three estimates of 10^30, 10^33, and 10^36 FLOPs. (for reference, the lowest training estimate - 10^30 - would take the supercomputer pictured above 300,000 years to complete; the highest, 300 billion.) Or What If We Ignore All Of
That And Do Something Else? This is piling a lot of assumptions atop each other, so Ajeya tries three other methods of figuring out how hard this training task is. Humans seem to be human-level AIs. How much training do we need? You can analogize our childhood to an AI’s training period. We receive a stream of sense-data. We start out flailing kind of randomly. Some of what we do gets rewarded. Some of what we do gets punished. Eventually our behavior becomes more sophisticated. We subject our new behavior to reward or punishment, fine-tune it further. Rent asks us:
how do you measure the life of a woman or man? It answers: “in daylights, in sunsets, in midnights, in cups of coffee; in inches, in miles, in laughter, in strife.” But you can also measure in floating point operations, in which case the answer is about 10^24. This is actually trivial: multiply the 10^15 FLOP/S of the human brain by the ~10^9 seconds of childhood and adolescence. This new estimate of 10^24 is much lower than our neural net estimate of 10^30 - 10^36 above. In fact, it’s only a hair above the amount it took to train GPT-3!
If human-level AI was this easy, we should have hit it by accident sometime in the process of making a GPT-4 prototype. Since OpenAI hasn’t mentioned this, probably it’s harder than this and we’re missing something. Probably we’re missing that humans aren’t blank slates. We don’t start at zero and then only use our childhood to train us further. The very structure of our brain encodes certain assumptions about what kinds of data we should be looking out for and how we should use it. Our training data isn’t just what we observed during childhood, it’s everything that any of
our ancestors observed during evolution. How many floating-point operations is the evolutionary process? Ajeya estimates 10^41. I can’t believe I’m writing this. I can’t believe someone actually estimated the number of floating point operations involved in jellyfish rising out of the primordial ooze and eventually becoming fish and lizards and mammals and so on all the way to the Ascent of Man. Still, the idea is simple. You estimate how long animals with neurons have been around for (10^16 seconds), total number of animals at any given second (10^20) times average number of FLOPS per animal (10^5) and you can
read more here but it comes out to 10^41 FLOs. I would not call this an exact estimate - for one thing, it assumes that all animals are nematodes, on the grounds that non-nematode animals are basically a rounding error in the grand scheme of things. But it does justify this bizarre assumption, and I don’t feel inclined to split hairs here - surely the total amount of computation performed by evolution is irrelevant except as an extreme upper bound? Surely the part where Australia got all those weird marsupials wasn’t strictly necessary for the human brain to have human-level
intelligence? One more weird human training data estimate attempt: what about the genome? If in some sense a bit of information in the genome is a “parameter”, how many parameters does that suggest humans have, and how does it affect training time? Ajeya calculates that the genome has about 7.5x10^8 parameters (compared to 10^15 parameters in our neural net calculation, and 10^11 for GPT-3). So we can… Okay, I’ve got to admit, this doesn’t have quite the same “huh?!” factor as trying to calculate the number of FLOs in evolution, but it is in a lot of ways even crazier.
The Japanese canopy plant has a genome fifty times larger than ours, which suggests that genome size doesn’t correspond very well to organism awesomeness. Also, most of the genome is coding for weird proteins that stabilize the shape of your kidney tubule or something, why should this matter for intelligence? The Japanese canopy plant. I think it is very pretty, but probably low prettiness per megabyte of DNA . I think Ajeya would answer that she’s debating orders of magnitude here, and each of these weird things costs only a few OOMs and probably they all even out. That still
leaves the question of why she thinks this approach is interesting at all, to which she answers that: The motivating intuition is that evolution performed a search over a space of small, compact genomes which coded for large brains rather than directly searching over the much larger space of all possible large brains, and human researchers may be able to compete with evolution on this axis. So maybe instead of having to figure out how to generate a brain per se, you figure out how to generate some short(er) program that can output a brain? But this would be very
different from how ML works now. Also, you need to give each short program the chance to unfold into a brain before you can evaluate it, which evolution has time for but we probably don’t. Ajeya sort of mentions these problems and counters with an argument that maybe you could think of the genome as a reinforcement learner with a long horizon. I don’t quite follow this but it sounds like the sort of thing that almost might make sense. Anyway, when you apply the scaling laws to a 7.510^8 parameter genome and penalize it for a long horizon, you
get about 10^33 FLOPs, which is weirdly similar to some of the other estimates. So now we have six different training cost estimates. First, neural nets with short, medium, and long horizons, which are 10^30, 10^33, and 10^36 FLOPs, respectively. Next, the amount of training data in a human lifetime - 10^24 FLOs - and in all of evolutionary history - 10^41 FLOPs. And finally, this weird genome thing, which is 10^33 FLOPs. An optimist might say “Well, our lowest estimate is 10^24 FLOPs, our highest is 10^41 FLOPs, those sound like kind of similar numbers, at least there’s no
“5 FLOPs” or “10^9999 FLOPs” in there. A pessimist might say “The difference between 10^24 and 10^41 is seventeen orders of magnitude, ie a factor of 100,000,000,000,000,000 times. This barely constrains our expectations at all!” Before we decide who to trust, let’s remember that we’re still only at Step 2 of our eight step Methodology, and continue. How Do We Adjust For Algorithmic Progress? So today, in 2022 (or in 2020 when this was written, or whenever), assume it would take about 10^33 FLOs to train a human-level AI. But technology constantly advances. Maybe we’ll discover ways to train AIs
faster, or run AIs more efficiently, or something like that. How does that factor into our estimate? Ajeya draws on Hernandez & Brown’s Measuring The Algorithmic Efficiency Of Neural Networks . They look at how many FLOPs it took to train various image recognition AIs to an equivalent level of performance between 2012 and 2019, and find that over those seven years it decreased by a factor of 44x, ie training efficiency doubles every sixteen months! Ajeya assumes a doubling time slightly longer than that, because it’s easier to make progress in simple well-understood fields like image recognition than in
the novel task of human-level AI. She chooses a doubling time of “merely” 2 - 3 years. If training efficiency doubles every 2-3 years, it would dectuple in about 10 years. So although it might take 10^33 FLOPs to train a human level AI today, in ten years or so it may take only 10^32, in twenty years 10^31, and so on. When Will Anyone Have Enough Computational Resources To Train A Human-Level AI? In 2020, AI researchers could buy computational resources at about $1 for 10^17 FLOPs. That means the 10^33 FLOPs you’d need to train a human-level AI
would cost $10^16, ie ten quadrillion dollars. This is about twenty times more money than exists in the entire world. But compute costs fall quickly. Some formulations of Moore’s Law suggest it halves every eighteen months. These no longer seem to hold exactly, but it does seem to be halving maybe once every 2.5 years. The exact number is kind of controversial: Ajeya admits it’s been more like once every 3-4 years lately, but she heard good things about some upcoming chips and predicted it might revert back to the longer-term faster trend (it’s been two years now, some new
chips have come out, and this prediction is looking pretty good). So as time goes on, algorithmic progress will cut the cost of training (in FLOPs), and hardware progress will also cut the cost of FLOPs (in dollars). So training will become gradually more affordable as time goes on. Once it reaches a cost somebody is willing to pay, they’ll buy human-level AI, and then that will be the year human-level AI happens. What is the cost that somebody (company? government? billionaire?) is willing to pay for human-level AI? The most expensive AI training in history was AlphaStar, a DeepMind
project that spent over $1 million to train an AI to play StarCraft ( in their defense, it won). But people have been pouring more and more money into AI lately: Source here . This is about compute rather than cost, but most of the increase seen here has been companies willing to pay for more compute over time, rather than algorithmic or hardware progress. The StarCraft AI was kind of a vanity project, or science for science’s sake, or whatever you want to call it. But AI is starting to become profitable, and human-level AI would be very profitable.
Who knows how much companies will be willing to pay in the future? Ajeya extrapolates the line on the graph forward to 2025 and gets $1 billion. This is starting to sound kind of absurd - the entire company OpenAI was founded with $1 billion in venture capital, it seems like a lot to expect them to spend more than $1 billion on a single training run. So Ajeya backs off from this after 2025 and predicts a “two year doubling time”. This is not much of a concession. It still means that in 2040 someone might be spending $100
billion to train one AI. Is this at all plausible? At the height of the Manhattan Project, the US was investing about 0.5% of its GDP into the effort; a similar investment today would be worth $100 billion. And we’re about twice as rich as 2000, so 2040 might be twice as rich as we are. At that point, $100 billion for training an AI is within reach of Google and maybe a few individual billionaires (though it would still require most or all of their fortune). Ajeya creates a complicated function to assess how much money people will be
willing to pay on giant AI projects per year. This looks like an upward-sloping curve. The line representing the likely cost of training a human-level AI looks like a downward sloping curve. At some point, those two curves meet, representing when human-level AI will first be trained. So When Will We Get Human-Level AI? The report gives a long distribution of dates based on weights assigned to the six different models, each of which has really wide confidence intervals and options for adjusting the mean and variance based on your assumptions. But the median of all of that is 10%
chance by 2031, 50% chance by 2052, and almost 80% chance by 2100. Ajeya takes her six models and decides to weigh them like so, based on how plausible she thinks each one is: 20% neural net, short horizon 30% neural net, medium horizon 15% neural net, long horizon 5% human lifetime as training data 10% evolutionary history as training data 10% genome as parameter number She ends up with this: How Sensitive Is This To Changes In Assumptions? She very helpfully gives us a Colab notebook and Google spreadsheet to play around with. The notebook lets you change some
of the more detailed parameters of the individual models, and the spreadsheet lets you change the big picture. I leave the notebook to people more dedicated to forecasting than I am, and will talk about the spreadsheet here. If you’re following along at home, the default spreadsheet won’t reflect Ajeya’s findings until you fill in the table in the bottom left like so: Great. Now that we’ve got that, let’s try changing some stuff. I like the human childhood training data argument (Lifetime Anchor) more than Ajeya does, and I like the size-of-the-genome argument less. I’m going to change the
weights to 20-20-0-20-20-20. Also, Ajeya thinks that someone might be willing to spend 1% of national GDP on training AIs, but that sounds really high to me, so I’m going to down to 0.1%. Also, Ajeya’s estimate of 3% GDP growth sounds high for the sort of industrialized nations who might do AI research, I’m going to lower it to 2%. Since I’m feeling mistrustful today, let’s use the Hernandez&Brown estimate for compute halving (1.5 years) in place of Ajeya’s ad hoc adjustments. And let’s use the current compute halving time (3.5 years) instead of Ajeya’s overly rosy version (2.5
years). All these changes… …don’t really do much. The median goes from 2052 to about 2065. Four of the models give results between 2030 and 2070. The last two, Neural Net With Long Horizon and Evolution, suggest probably no AI this century (although Neural Net With Long Horizon does think there’s a 40% chance by 2100). Ajeya doesn’t really like either of these models and they’re not heavily weighted in her main result. Does The Truth Point To Itself? Back up a second. Here’s something that makes me kind of nervous. Most of Ajeya’s numbers are kind of made up,
with several order-of-magnitude error bars and simplifying assumptions like “all animals are nematodes”. For a single parameter, we get estimates spanning seventeen different orders of magnitude: the upper bound is one hundred quadrillion times the lower bound. And yet four of the six models, including two genuinely exotic ones, manage to get dates within twenty years of 2050. And 2050 is also the date everyone else focuses on. Here’s the prediction-market-like site Metaculus : Their distribution looks a lot like Ajeya’s, and even has the same median, 2052 (though forecasters could have read Ajeya’s report). Katja Grace et al surveyed
352 AI experts , and they gave a median estimate of 2062 for an AI that could “outperform humans at all tasks” (though with many caveats and high sensitivity to question framing). This was before Ajeya’s report, so they definitely didn’t read it. So lots of Ajeya’s different methods and lots of other people presumably using different methodologies or no methodology at all, all converge on this same idea of 2050 give or take a decade or two. An optimist might say “The truth points to itself! There are 371 known proofs of the Pythagorean Theorem, and they all end
up in the same place. That’s because no matter what methodology you use, if you use it well enough you get to the correct answer.” A pessimist might be more suspicious; we’ll return to this part later. FLOPS Alone Turn The Wheel Of History One more question: what if this is all bullshit? What if it’s an utterly useless total garbage steaming pile of grade A crap? Imagine a scientist in Victorian Britain, speculating on when humankind might invent ships that travel through space. He finds a natural anchor: the moon travels through space! He can observe things about the
moon: for example, it is 220 miles in diameter (give or take an order of magnitude). So when humankind invents ships that are 220 miles in diameter, they can travel through space! Ships have certainly grown in size tremendously, from primitive kayaks to Roman triremes to Spanish galleons to the great ocean liners of the (Victorian) present. The AI forecasting organization AI Impacts actually has a whole report on historical ship size trends to prove an unrelated point about technological progress, so I didn’t even have to make this graph up. Suppose our Victorian scientist lived in 1858, right when
the Great Eastern was launched. The trend line for ship size crossed 100m around 1843, and 200m in 1858, so doubling time is 15 years - but perhaps they notice this is going to be an outlier, so let’s round up a bit and say 18 years. The (one order of magnitude off estimate for the size of the) Moon is 350,000m, so you’d need ships to scale up by 350,000/200 = 1,750x before they’re as big as the Moon. That’s about 10.8 doublings, and a doubling time is 18 years, so we’ll get spaceships in … 2052
exactly. (fudging numbers to land where you want is actually fun and easy) SS Great Eastern, the extreme outlier large steamship from 1858. This has become sort of a mascot for quantitative technological progress forecasters. What is this scientist’s error? The big one is thinking that spaceship progress depends on some easily-measured quantity (size) instead of on fundamental advances (eg figuring out how rockets work). You can make the same accusation against Ajeya et al: you can have all the FLOPs in the world, but if you don’t understand how to make a machine think, your AI will be, well,
a flop. Ajeya discusses this a bit on page 143 of her report. There is some sense in which FLOPs and knowing-what-you’re-doing trade of against each other. If you have literally no idea what you’re doing, you can sort of kind of re-run evolution until it comes up with something that looks good. If things are somehow even worse than that , you could always run AIXI , a hypothetical AI design guaranteed to get excellent results as long as you have infinite computation. You could run a Go engine by searching the entire branching tree structure of Go -
you shouldn’t , and it would take a zillion times more compute than exists in the entire world, but you could . So in some sense what you’re doing, when you’re figuring out what you’re doing, is coming up with ways to do already-possible things more efficiently. But that’s just algorithmic progress, which Ajeya has already baked into her model. (our Victorian scientist: “As a reductio ad absurdum , you could always stand the ship on its end, and then climb up it to reach space. We’re just trying to make ships that are more efficient than that.”) Part II:
Biology-Inspired AI Timelines: The Trick That Never Works Eliezer Yudkowsky presents a more subtle version of these kinds of objection in an essay called Biology-Inspired AI Timelines: The Trick That Never Works , published December 2021. Ajeya’s report is a 169-page collection of equations, graphs, and modeling assumptions. Yudkowsky’s rebuttal is a fictional dialogue between himself, younger versions of himself, famous AI scientists, and other bit players. At one point, a character called “Humbali” shows up begging Yudkowsky to be more humble, and Yudkowsky defeats him with devastating counterarguments. Still, he did found the field, so I guess everyone has
to listen to him. He starts: in 1988, famous AI scientist Hans Moravec predicted human-level AI by 2010. He was using the same methodology as Ajeya: extrapolate how quickly processing power would grow (in FLOP/S), and see when it would match some estimate of the human brain. Moravec got the processing power almost exactly right (it hit his 2010 projection in 2008) and his human brain estimate pretty close (he says 10^13 FLOP/S, Ajeya says 10^15, this 2 OOM difference only delays things a few years), yet there was not human-level AI in 2010. What happened? Ajeya’s answer could be:
Moravec didn’t realize that, in the modern ML paradigm, any given size of program requires a much bigger program to train. Ajeya, who has a 35-year advantage on Moravec, estimates approximately the same power for the finished program (10^16 vs. 10^13 FLOP/S) but says that training the 10^16 FLOP/S program will require 10^33ish FLOPs. Eliezer agrees as far as it goes, but says this points to a much deeper failure mode, which was that Moravec had no idea what he was doing. He was assuming processing power of human brain = processing power of computer necessary for AGI. Why? The
human brain consumes around 20 watts of power. Can we thereby conclude that an AGI should consume around 20 watts of power, and that, when technology advances to the point of being able to supply around 20 watts of power to computers, we’ll get AGI? […] You say that AIs consume energy in a very different way from brains? Well, they’ll also consume computations in a very different way from brains! The only difference between these two cases is that you know something about how humans eat food and break it down in their stomachs and convert it into ATP
that gets consumed by neurons to pump ions back out of dendrites and axons, while computer chips consume electricity whose flow gets interrupted by transistors to transmit information. Since you know anything whatsoever about how AGIs and humans consume energy, you can see that the consumption is so vastly different as to obviate all comparisons entirely. You are ignorant of how the brain consumes computation, you are ignorant of how the first AGIs built would consume computation, but “an unknown key does not open an unknown lock” and these two ignorant distributions should not assert much internal correlation between them.
Cars don’t move by contracting their leg muscles and planes don’t fly by flapping their wings like birds. Telescopes do form images the same way as the lenses in our eyes, but differ by so many orders of magnitude in every important way that they defy comparison. Why should AI be different? You have to use some specific algorithm when you’re creating AI; why should we expect it to be anywhere near the same efficiency as the ones Nature uses in our brains? The same is true for arguments from evolution, eg Ajeya’s Evolutionary Anchor, ie “it took evolution 10^43
FLOPs of computation to evolve the human brain so maybe that will be the training cost”. AI scientists sitting in labs trying to figure things out, and nematodes getting eaten by other nematodes, are such different methods for designing things that it’s crazy to use one as an estimate for the other. Algorithmic Progress vs. Algorithmic Paradigm Shifts This post is a dialogue, so (Eliezer’s hypothetical model of) OpenPhil gets a chance to respond. They object: this is why we put a term for algorithmic progress in our model. The model isn’t very sensitive to changes in that term. If
you want you can set it to some kind of crazy high value and see what happens, but you can’t say we didn’t consider it. OpenPhil: We did already consider that and try to take it into account: our model already includes a parameter for how algorithmic progress reduces hardware requirements. It’s not easy to graph as exactly as Moore’s Law, as you say, but our best-guess estimate is that compute costs halve every 2-3 years […] Eliezer: The makers of AGI aren’t going to be doing 10,000,000,000,000 rounds of gradient descent, on entire brain-sized 300,000,000,000,000-parameter models, algorithmically faster than
today. They’re going to get to AGI via some route that you don’t know how to take, at least if it happens in 2040. If it happens in 2025, it may be via a route that some modern researchers do know how to take, but in this case, of course, your model was also wrong. They’re not going to be taking your default-imagined approach algorithmically faster, they’re going to be taking an algorithmically different approach that eats computing power in a different way than you imagine it being consumed. OpenPhil: Shouldn’t that just be folded into our estimate of how
the computation required to accomplish a fixed task decreases by half every 2-3 years due to better algorithms? Eliezer: Backtesting this viewpoint on the previous history of computer science, it seems to me to assert that it should be possible to: - Train a pre-Transformer RNN/CNN-based model, not using any other techniques invented after 2017, to GPT-2 levels of performance, using only around 2x as much compute as GPT-2; - Play pro-level Go using 8-16 times as much computing power as AlphaGo, but only 2006 levels of technology. For reference, recall that in 2006, Hinton and Salakhutdinov were just starting
to publish that, by training multiple layers of Restricted Boltzmann machines and then unrolling them into a “deep” neural network, you could get an initialization for the network weights that would avoid the problem of vanishing and exploding gradients and activations. At least so long as you didn’t try to stack too many layers, like a dozen layers or something ridiculous like that. This being the point that kicked off the entire deep-learning revolution. Your model apparently suggests that we have gotten around 50 times more efficient at turning computation into intelligence since that time; so, we should be able
to replicate any modern feat of deep learning performed in 2021, using techniques from before deep learning and around fifty times as much computing power. OpenPhil: No, that’s totally not what our viewpoint says when you backfit it to past reality. Our model does a great job of retrodicting past reality. Eliezer: How so? OpenPhil:
Backlinks
- AGI
- AI Impacts
- Ajeya
- Ajeya Cotra
- AlphaGo
- AlphaGo
- Astral Codex Ten
- Australia
- Bio Anchors
- Bryan Caplan
- Carl
- Carl Shulman
- Concepts: A
- Concepts: B
- Concepts: C
- Concepts: D
- Concepts: E
- Concepts: F
- Concepts: G
- Concepts: J
- Concepts: L
- Concepts: M
- Concepts: N
- Concepts: P
- Concepts: R
- Concepts: S
- Concepts: T
- Deep Blue
- DeepMind
- Eliezer
- Eliezer Yudkowsky
- Events: A
- Events: M
- Events: N
- Events: S
- Events: W
- evolution
- Films
- Game
- Go
- GOFAI
- GPT-3
- GPT-3
- GPT-4
- Hans Moravec
- Hinton
- Holden Karnofsky
- Joe Carlsmith
- Katja Grace
- Less Wrong
- Manhattan Project
- Mark Xu
- Matthew Barnett
- Metaculus
- MIRI
- Moore’s Law
- Moore’s Law
- Nature
- Open Phil
- Open Philanthropy Project
- OpenAI
- OpenPhil
- Organizations: A
- Organizations: D
- Organizations: G
- Organizations: M
- Organizations: O
- Paul
- Paul Christiano
- People: A
- People: B
- People: C
- People: E
- People: G
- People: H
- People: J
- People: K
- People: M
- People: N
- People: O
- People: P
- People: R
- People: S
- People: V
- Places: A
- Publications: A
- Publications: B
- Publications: C
- Publications: F
- Publications: G
- Publications: L
- Publications: M
- Publications: Y
- Ray Kurzweil
- Software
- Starcraft
- Timeline of Issues
- Vanessa Kosoy
- Venues: G
- Yudkowsky contra Cotra on biological anchors