Software

Entities classified as software within this archive.

Reference Index

AlphaGo: 1 mention across 1 issue

AlphaGo is software referenced in the Astral Codex Ten archive, appearing once, in a single issue dated August 06, 2021. The archive places it in contexts such as "Existing AIs like AlphaGo or GPT seem to be basically a blob". It appears alongside AGI, AI, and AI Impacts.

Reference entry: AlphaGo
Mention count: 1
Issue count: 1
First seen: August 06, 2021
Last seen: August 06, 2021

Outbound links

Issue trail

Highlights From The Comments On Acemoglu And AI · August 06, 2021

AGI 1 shared issue
AI 1 shared issue
AI Impacts 1 shared issue
AI risk 1 shared issue
algorithmic bias 1 shared issue

Source context

Highlights From The Comments On Acemoglu And AI

August 06, 2021 · Original source

My personal estimates are more like 75% chance, 25% chance, and a distribution that peaks about 20 years later than this one. I think the Metaculus position is consistent with all of “this probably won’t happen”, “THIS IS SUPER-TERRIFYING”, “this is most likely far away”, and “BUT FOR ALL WE KNOW IT COULD BE TOMORROW!” I realize this is an annoying way for things to be.

-----------------

CraigMichael writes:

>But all the AI regulation in the world won’t help us unless we humans resist the urge to spread misinformation to maximize clicks.

Was with you up to this point. There are several solutions to this other than willpower (resisting the urge). The basic idea - change incentives so that spreading misinformation is possible but substantially less desirable/lucrative than other options for online behaviors.

This isn’t so hard to imagine. Say there’s a lot of incentives to earn money online doing creative or useful things. Like Mechanical Turk, but less rote behavior and more performing a service or matching needs. Like I wish I had a help desk for English questions where the answers were good and not people posturing to look good to other people on the English Stack Exchange, for example. I would pay them per call or per minute or whatever. Totally unexplored market AFAIK because technology hasn’t been developed yet.

Another idea - Give people more options to pay at an article-level for information that’s useful to them or to have related questions answered or something like that without needing a subscription or a bundle. Say there’s some article about anything and I want to contact the author and be like “hey, here’s a related question, I’m willing to offer you X dollars to answer.” The person says “I’ll do it for x+10 dollars.” One site used to unlock articles to the public after a threshold of Bitcoin had been donated on a PPV basis. It both incentivized the author and had a positive externality.

Everyone is so invested in ads that they don’t work on technology and ideas to create new markets. To paraphrase Jaron Lanier we need to make technology so good it seduces us away from destroying ourselves.

Partly I want to complain that obviously I was using the quoted sentence as a rhetorical device. But I guess the whole point of that sentence and its paragraph was to argue against saying false things as a rhetorical device, so - hoist on my own petard, I guess.

I’m less optimistic than Craig is about this solution, because it seems to me that socially virtuous technology will always be less fun/addictive than nonvirtuous technology, simply because the virtuous technology has to hit two targets (virtuous, fun/addictive), the nonvirtuous technology only has to hit one target, and it’s easier to optimize for a target with zero other constraints than with one other constraint. See eg Meditations on Moloch.

-----------------

Souf asks:

Is there a convincing argument that AGI is possible within any reasonable timeframe (like... 50 years), other than the intuitions of esteemed AI researchers? Do they have any way to back up their estimates (of some tens of percent), and why they shouldn't be millionths of a percent? It is, as another poster said, an "extraordinary claim." I'd like to see some extraordinary support of those particular numbers.

If I had to answer this question, I would point to the sorts of work AI Impacts does, where they try to estimate how capable computers were in 1980, 1990, etc, draw a line to represent the speed at which computers are becoming more capable, figure out where humans are at the same metric, and check the time when that line crosses however capable you’ve decided humans are. This is obviously really hard because you have to operationalize some definition of “capable” or “intelligent” or some other word that is hard to operationalize, but when you do it you usually get sometime in the mid-21st century.

You’re going to point out that this argument doesn’t really qualify as “convincing”. I admit it doesn’t meet trial-by-jury standards of evidence. So I guess my real answer would be “it’s the #$@&ing prior”. Like, you certainly don’t have knock-down evidence that it’s impossible, I don’t have knock-down evidence that it’s certain, so it might happen and it might not. How “might” are we talking? I don’t know, it would seem weird if this quickly-advancing technology being researched by incredibly smart people with billions of dollars in research funding from lots of megacorporations just reached some point and then stopped. Okay, fine, maybe it will keep advancing at the same rate, how fast is that in terms of time-to-AGI? Now we’re back at AI Impacts drawing lines again.

The stupidest possible prior is always 50-50. We would have to be very stupid people to use the stupidest possible prior. But here we are. I wouldn’t want to give a 50-50 chance of us inventing FTL travel by 2100, because FTL travel seems physically impossible. I wouldn’t want to give a 50-50 chance of us inventing slower-than-light-but-still-pretty-good starships by 2100, because, I dunno, space travel isn’t advancing that fast and nobody is really working on it that hard. For AI, I don’t know, I kinda want to say 50-50.

If I were going to try to update away from 50-50, I would want to look at AI Impacts style line graphs, expert opinion, and prediction markets. All of those seem to make me update up instead of down, so I don’t think I would go lower than 50-50. But there’s enough Knightian uncertainty to make an entire Round Table here, so who knows? Hardly a “convincing” argument, but I’m just trying to avoid the McAfee Fallacy:

[image in original]

-----------------

Souf continues:

The argument that we are "in the middle of a period of extremely rapid progress in AI research, when barrier after barrier is being breached" makes it seem like all AI "progress" is on some sort of line that ends in AGI. That feels like sleight-of-hand. Even Scott himself refers to AGI here as a "new class of actor," so I'm failing to see how current lines of "progress" will indubitably result in the emergence of something completely novel and different?

Lots of smart people disagree with me on this one, but I think the path from here to AGI is pretty straight. I mean, it will take thousands of people who are all much smarter than I am to do it, but it’ll happen.

My argument is something like - human brains are remarkably similar to rat brains, only much bigger. They’re still a little similar to insect brains. It looks like if you have a basic functioning brain, and you scale it up, it gets human intelligence.

Existing AIs like AlphaGo or GPT seem to be basically a blob of learning-ability, a plan for pointing the blob at a specific problem, and lots and lots of training data. I think the past five years have shown that this basic model generalizes really well. OpenAI’s programs can now write essays, compose music, and generate pictures, not because they had three parallel amazing teams working on writing/music/art AIs, but because they took a blob of learning ability and figured out how to direct it at writing/music/art, and they were able to get giant digital corpuses of text / music / pictures to train it. DeepMind is finding that it can win lots of games, from Go to StarCraft to obstacle courses in simulated environments, by pointing a blob of learning-ability at the game and making it play against itself a zillion times (ie generate its own training data).

My impression is that human/rat/insect brains are a blob of learning-ability which the rest of the nervous system successfully points at the world, and especially at aspects of the world that the organism needs to pay attention to (eg food sources, sex, etc). This isn’t exactly right, there are a few genetically-encoded programs, but not that many and it’s pretty hard.

Right now I think our main advantages over AI systems are something like: our nervous system is pretty good at pointing us at the world and extracting training data from it. If you wanted an AI that learned being-in-the-world skills as well as we do, it would have to have an amazing robot body, and right now robot bodies aren’t that amazing.
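
The line-drawing estimate described in the excerpt above (estimate capability at several dates, fit a trend, and see when it crosses a chosen human-level mark) can be sketched roughly as follows. This is only an illustration, not AI Impacts' actual method: the data points, the log-linear trend, the `human_level` threshold, and the function names `fit_line` and `crossing_year` are all assumptions made up for the example.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossing_year(years, capabilities, human_level):
    """Year at which a log-linear capability trend reaches human_level.

    Returns None if the fitted trend is flat or falling.
    """
    logs = [math.log10(c) for c in capabilities]
    slope, intercept = fit_line(years, logs)
    if slope <= 0:
        return None
    return (math.log10(human_level) - intercept) / slope


# Hypothetical numbers: capability grows 100x per decade on some made-up metric.
years = [1980, 1990, 2000, 2010, 2020]
capabilities = [1e3, 1e5, 1e7, 1e9, 1e11]
human_level = 1e16  # assumed threshold on the same made-up metric

print(crossing_year(years, capabilities, human_level))  # roughly 2045 for these inputs
```

The answer moves a great deal with the choice of metric and threshold, which is the operationalization difficulty the excerpt points out.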

Inline links: writes, Meditations on Moloch, Souf, the sorts of work AI Impacts does, https://substackcdn.com/image/fetch/$s_!3MgL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F7db78f49-9ccb-4b6e-ac18-cfb79f52cb04_584x232.png, not that many and it’s pretty hard

Fritz: 1 mention across 1 issue

Fritz is software referenced in the Astral Codex Ten archive, appearing once, in a single issue dated February 23, 2022. The archive places it in contexts such as "I wanted to compare Fritz (which won WCCC in 1995)". It appears alongside AGI, AI Impacts, and AIXI.

Reference entry: Fritz
Mention count: 1
Issue count: 1
First seen: February 23, 2022
Last seen: February 23, 2022

Outbound links

Issue trail

Biological Anchors: A Trick That Might Or Might Not Work · February 23, 2022

AGI 1 shared issue
AI Impacts 1 shared issue
AIXI 1 shared issue
Ajeya 1 shared issue
Ajeya Cotra 1 shared issue

Source context

Biological Anchors: A Trick That Might Or Might Not Work

February 23, 2022 · Original source

I wanted to compare Fritz (which won WCCC in 1995) to a modern engine to understand the effects of hardware and software performance. I think the time controls for that tournament are similar to SF STC. I wanted to compare to SF8 rather than one of the NNUE engines to isolate out the effect of compute at development time and just look at test-time compute.

So having modern algorithms would have let you win WCCC while spending about 50x less on compute than the winner. Having modern computer hardware would have let you win WCCC spending way more than 1000x less on compute than the winner. Measured this way software progress seems to be several times less important than hardware progress despite much faster scale-up of investment in software.

But instead of asking "how well does hardware/software progress help you get to 1995 performance?" you could ask "how well does hardware/software progress get you to 2015 performance?" and on that metric it looks like software progress is way more important because you basically just can't scale old algorithms up to modern performance.

The relevant measure varies depending on what you are asking. But from the perspective of takeoff speeds, it seems to me like one very salient takeaway is: if one chess project had literally come back in time with 20 years of chess progress, it would have allowed them to spend 50x less on compute than the leader.

Response 2: AI Impacts + Matthew Barnett

AI Impacts gathered and analyzed a dataset of who predicted AI when; Matthew Barnett helpfully drew in the line corresponding to Platt’s Law (everyone always predicts AI in thirty years).

Just eyeballing it, Platt’s Law looks pretty good. But Holden Karnofsky (see below) objects that our eyeballs are covertly removing outliers. Barnett agrees this is worth checking for and runs a formal OLS regression. Platt’s Law in blue, regression line in orange.

He writes:

I agree this trendline doesn't look great for Platt's law, and backs up your observation by predicting that Bio Anchors should be more than 30 years out.

However, OLS is notoriously sensitive to outliers. If instead of using some more robust regression algorithm, we instead super arbitrarily eliminated all predictions after 2100, then we get this, which doesn't look absolutely horrible for the law. Note that the median forecast is 25 years out.

I’m split on what to think here. If we consider a weaker version of Platt’s Law, “the average date at which people forecast AGI moves forward at about one year per year”, this seems truish in the big picture where we compare 1960 to today, but not obviously true after 1980. If we consider a different weaker version, “on average estimates tend to be 30 years away”, that’s true-ish under Barnett’s revised model, but not inherently damning since Barnett’s assuming there will be some such number, it turns out to be 25, and Ajeya gave the somewhat different number of 32. Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?

Response 3: Real OpenPhil

The hypothetical OpenPhil in Eliezer’s mind having been utterly vanquished, the real-world OpenPhil is forced to step in. OpenPhil CEO Holden Karnofsky responds to Eliezer here. There’s a lot of back and forth about whether the report includes enough caveats (answer: it sure does include a lot of caveats!) but I was most interested in the attacks on Eliezer’s two main points.

First, the point that biological anchors are fatally flawed from the start and measuring FLOP/S is no better than measuring power consumption in watts. Holden:

If the world were such that:

We had some reasonable framework for "power usage" that didn't include gratuitously wasted power, and measured the "power used meaningfully to do computations" in some important sense;
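
The outlier point in the excerpt above can be illustrated with a small sketch: fit the predicted year against the year the forecast was made, once on all the data and once after dropping predictions later than a cutoff. The forecasts below are invented for the example (they are not the AI Impacts dataset), and the names `ols`, `fit_forecasts`, and `FORECASTS` are hypothetical. If Platt's Law held exactly, the fit would have slope 1 and intercept 30.

```python
from statistics import median


def ols(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_forecasts(forecasts, cutoff=None):
    """Fit predicted year against year made, optionally dropping late predictions.

    Returns (slope, intercept, median years-until-AI) for the kept forecasts.
    """
    kept = [(made, pred) for made, pred in forecasts
            if cutoff is None or pred <= cutoff]
    slope, intercept = ols([made for made, _ in kept],
                           [pred for _, pred in kept])
    horizon = median(pred - made for made, pred in kept)
    return slope, intercept, horizon


# Made-up (year forecast was made, year AI was predicted) pairs.
# Six follow a 30-year horizon; the last is a single far-future outlier.
FORECASTS = [
    (1960, 1990), (1970, 2000), (1980, 2010),
    (1990, 2020), (2000, 2030), (2010, 2040),
    (2015, 2300),
]

print(fit_forecasts(FORECASTS))               # slope pulled far above 1 by the outlier
print(fit_forecasts(FORECASTS, cutoff=2100))  # slope 1 once the outlier is dropped
```

With these invented numbers the median horizon is 30 years either way, while the OLS slope changes sharply. That is the sense in which a single extreme forecast can dominate a least-squares line.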

Inline links: AI Impacts, Matthew Barnett, https://substackcdn.com/image/fetch/$s_!17-W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa751f624-0392-4610-8a93-7bb94a60d1b3_1182x778.png, https://substackcdn.com/image/fetch/$s_!54Vh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1c354075-ecaa-4807-a1a5-07931736f093_403x268.png, writes, https://substackcdn.com/image/fetch/$s_!dw02!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F797aef17-dc24-4845-9e00-2c3fd7f7dc32_403x268.png, here

GPT: 1 mention across 1 issue

GPT is software referenced in the Astral Codex Ten archive, appearing once, in a single issue dated August 06, 2021. The archive places it in contexts such as "Existing AIs like AlphaGo or GPT seem to be basically a blob". It appears alongside AGI, AI, and AI Impacts.

Reference entry: GPT
Mention count: 1
Issue count: 1
First seen: August 06, 2021
Last seen: August 06, 2021

Outbound links

Issue trail

Highlights From The Comments On Acemoglu And AI · August 06, 2021

AGI 1 shared issue
AI 1 shared issue
AI Impacts 1 shared issue
AI risk 1 shared issue
algorithmic bias 1 shared issue

Modern Stockfish: 1 mention across 1 issue

Modern Stockfish is software referenced in the Astral Codex Ten archive, appearing once, in a single issue dated February 23, 2022. The archive places it in contexts such as "Modern Stockfish outperformed historical chess engines". It appears alongside AGI, AI Impacts, and AIXI.

Reference entry: Modern Stockfish
Mention count: 1
Issue count: 1
First seen: February 23, 2022
Last seen: February 23, 2022

Outbound links

Issue trail

Biological Anchors: A Trick That Might Or Might Not Work · February 23, 2022

AGI 1 shared issue
AI Impacts 1 shared issue
AIXI 1 shared issue
Ajeya 1 shared issue
Ajeya Cotra 1 shared issue

Source context

Biological Anchors: A Trick That Might Or Might Not Work

February 23, 2022 · Original source

Therefore, one of the primary impacts of new algorithms is to enable performance to continue scaling with compute the same way it did when you had smaller amounts.

In this model, it makes sense to think of the "contribution" of new algorithms as the factor by which they enable more efficient conversion of compute to performance, and to count the increased performance that comes because the new algorithms can absorb more compute as primarily hardware progress. I think the studies that Carl cites above are decent evidence that the multiplicative factor of compute -> performance conversion you get from new algorithms is smaller than the historical growth in compute, so it further makes sense to claim that most progress came from compute, even though the algorithms were what "unlocked" the compute.

For an example of something I consider supports this model, see the LSTM versus transformer graphs in https://arxiv.org/pdf/2001.08361.pdf

I also found Vanessa’s summary of this reply helpful:

Hmm... Interesting. So, this model says that algorithmic innovation is so fast that it is not much of a bottleneck: we always manage to find the best algorithm for given compute relatively quickly after this compute becomes available. Moreover, there is some smooth relation between compute and performance assuming the best algorithm for this level of compute. [EDIT: The latter part seems really suspicious though, why would this relation persist across very different algorithms?] Or at least this is true if "best algorithm" is interpreted to mean "best algorithm out of some wide class of algorithms s.t. we never or almost never managed to discover any algorithm outside of this class".

This can justify biological anchors as upper bounds[1]: if biology is operating using the best algorithm then we will match its performance when we reach the same level of compute, whereas if biology is operating using a suboptimal algorithm then we will match its performance earlier.

Charlie Steiner objects:

Which examples are you thinking of? Modern Stockfish outperformed historical chess engines even when using the same resources, until far enough in the past that computers didn't have enough RAM to load it.

I definitely agree with your original-comment points about the general informativeness of hardware, and absolutely software is adapting to fit our current hardware. But this can all be true even if advances in software can make more than 20 orders of magnitude difference in what hardware is needed for AGI, and are much less predictable than advances in hardware rather than being adaptations in lockstep with it.

And Paul Christiano responds:

Here are the graphs from Hippke (he or I should publish summary at some point, sorry).

I wanted to compare Fritz (which won WCCC in 1995) to a modern engine to understand the effects of hardware and software performance. I think the time controls for that tournament are similar to SF STC. I wanted to compare to SF8 rather than one of the NNUE engines to isolate out the effect of compute at development time and just look at test-time compute.

So having modern algorithms would have let you win WCCC while spending about 50x less on compute than the winner. Having modern computer hardware would have let you win WCCC spending way more than 1000x less on compute than the winner. Measured this way software progress seems to be several times less important than hardware progress despite much faster scale-up of investment in software.

But instead of asking "how well does hardware/software progress help you get to 1995 performance?" you could ask "how well does hardware/software progress get you to 2015 performance?" and on that metric it looks like software progress is way more important because you basically just can't scale old algorithms up to modern performance.

The relevant measure varies depending on what you are asking. But from the perspective of takeoff speeds, it seems to me like one very salient takeaway is: if one chess project had literally come back in time with 20 years of chess progress, it would have allowed them to spend 50x less on compute than the leader.

Response 2: AI Impacts + Matthew Barnett

AI Impacts gathered and analyzed a dataset of who predicted AI when; Matthew Barnett helpfully drew in the line corresponding to Platt’s Law (everyone always predicts AI in thirty years).

Just eyeballing it, Platt’s Law looks pretty good. But Holden Karnofsky (see below) objects that our eyeballs are covertly removing outliers. Barnett agrees this is worth checking for and runs a formal OLS regression. Platt’s Law in blue, regression line in orange.

He writes:

I agree this trendline doesn't look great for Platt's law, and backs up your observation by predicting that Bio Anchors should be more than 30 years out.

However, OLS is notoriously sensitive to outliers. If instead of using some more robust regression algorithm, we instead super arbitrarily eliminated all predictions after 2100, then we get this, which doesn't look absolutely horrible for the law. Note that the median forecast is 25 years out.

I’m split on what to think here. If we consider a weaker version of Platt’s Law, “the average date at which people forecast AGI moves forward at about one year per year”, this seems truish in the big picture where we compare 1960 to today, but not obviously true after 1980. If we consider a different weaker version, “on average estimates tend to be 30 years away”, that’s true-ish under Barnett’s revised model, but not inherently damning since Barnett’s assuming there will be some such number, it turns out to be 25, and Ajeya gave the somewhat different number of 32. Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?

Response 3: Real OpenPhil

The hypothetical OpenPhil in Eliezer’s mind having been utterly vanquished, the real-world OpenPhil is forced to step in. OpenPhil CEO Holden Karnofsky responds to Eliezer here. There’s a lot of back and forth about whether the report includes enough caveats (answer: it sure does include a lot of caveats!) but I was most interested in the attacks on Eliezer’s two main points.

First, the point that biological anchors are fatally flawed from the start and measuring FLOP/S is no better than measuring power consumption in watts. Holden:

If the world were such that:

We had some reasonable framework for "power usage" that didn't include gratuitously wasted power, and measured the "power used meaningfully to do computations" in some important sense;
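
One way to put the two multipliers from the chess comparison above on a common footing is to compare them on a log scale. This is an illustrative assumption, not the calculation behind the comment: the function name `log_shares` is hypothetical, and because the hardware figure is stated as "way more than 1000x", using 1000 only gives a lower bound on the hardware share.

```python
import math


def log_shares(software_factor, hardware_factor):
    """Split a combined compute saving into software and hardware shares,
    measured in orders of magnitude (log10 of each multiplier)."""
    software_oom = math.log10(software_factor)
    hardware_oom = math.log10(hardware_factor)
    total = software_oom + hardware_oom
    return software_oom / total, hardware_oom / total


# Figures quoted in the excerpt: about 50x from modern algorithms,
# and at least 1000x from modern hardware (treated here as a lower bound).
software_share, hardware_share = log_shares(50, 1000)
print(f"software: {software_share:.0%}, hardware: at least {hardware_share:.0%}")
```

On this accounting hardware contributes more than half of the orders of magnitude, and a hardware factor well beyond 1000x would widen the gap. A plain ratio of the two multipliers (1000 / 50) is a different, equally informal, way to read "several times less important".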

Inline links: https://arxiv.org/pdf/2001.08361.pdf, Vanessa’s summary, [1], Charlie Steiner, Modern Stockfish outperformed historical chess engines even when using the same resources, Paul Christiano, https://substackcdn.com/image/fetch/$s_!S8Yc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F96ae0981-cc53-4013-9eea-1d29d75f06ca_1456x1038.png, AI Impacts, Matthew Barnett, https://substackcdn.com/image/fetch/$s_!17-W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fa751f624-0392-4610-8a93-7bb94a60d1b3_1182x778.png, https://substackcdn.com/image/fetch/$s_!54Vh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F1c354075-ecaa-4807-a1a5-07931736f093_403x268.png, writes, https://substackcdn.com/image/fetch/$s_!dw02!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F797aef17-dc24-4845-9e00-2c3fd7f7dc32_403x268.png, here

NNUE engines · 1 mention across 1 issue

NNUE engines is a software entry in the Astral Codex Ten archive, appearing once, in a single issue dated February 23, 2022. The archive places it in contexts such as "rather than one of the NNUE engines". It appears alongside AGI, AI Impacts, and AIXI.

Reference entry: NNUE engines
Mention count: 1
Issue count: 1
First seen: February 23, 2022
Last seen: February 23, 2022

Outbound links

Issue trail

Biological Anchors: A Trick That Might Or Might Not Work February 23, 2022

AGI 1 shared issue
AI Impacts 1 shared issue
AIXI 1 shared issue
Ajeya 1 shared issue
Ajeya Cotra 1 shared issue

Source context

Biological Anchors: A Trick That Might Or Might Not Work

February 23, 2022 · Original source

I wanted to compare Fritz (which won WCCC in 1995) to a modern engine to understand the effects of hardware and software performance. I think the time controls for that tournament are similar to SF STC I think. I wanted to compare to SF8 rather than one of the NNUE engines to isolate out the effect of compute at development time and just look at test-time compute. So having modern algorithms would have let you win WCCC while spending about 50x less on compute than the winner. Having modern computer hardware would have let you win WCCC spending way more than 1000x less on compute than the winner. Measured this way software progress seems to be several times less important than hardware progress despite much faster scale-up of investment in software.

But instead of asking "how well does hardware/software progress help you get to 1995 performance?" you could ask "how well does hardware/software progress get you to 2015 performance?" and on that metric it looks like software progress is way more important because you basically just can't scale old algorithms up to modern performance. The relevant measure varies depending on what you are asking. But from the perspective of takeoff speeds, it seems to me like one very salient takeaway is: if one chess project had literally come back in time with 20 years of chess progress, it would have allowed them to spend 50x less on compute than the leader.

Response 2: AI Impacts + Matthew Barnett

AI Impacts gathered and analyzed a dataset of who predicted AI when; Matthew Barnett helpfully drew in the line corresponding to Platt’s Law (everyone always predicts AI in thirty years). Just eyeballing it, Platt’s Law looks pretty good. But Holden Karnofsky (see below) objects that our eyeballs are covertly removing outliers. Barnett agrees this is worth checking for and runs a formal OLS regression. Platt’s Law in blue, regression line in orange.

He writes: I agree this trendline doesn't look great for Platt's law, and backs up your observation by predicting that Bio Anchors should be more than 30 years out. However, OLS is notoriously sensitive to outliers. If instead of using some more robust regression algorithm, we instead super arbitrarily eliminated all predictions after 2100, then we get this, which doesn't look absolutely horrible for the law. Note that the median forecast is 25 years out.

I’m split on what to think here. If we consider a weaker version of Platt’s Law, “the average date at which people forecast AGI moves forward at about one year per year”, this seems truish in the big picture where we compare 1960 to today, but not obviously true after 1980. If we consider a different weaker version, “on average estimates tend to be 30 years away”, that’s true-ish under Barnett’s revised model, but not inherently damning since Barnett’s assuming there will be some such number, it turns out to be 25, and Ajeya gave the somewhat different number of 32. Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?

Response 3: Real OpenPhil

The hypothetical OpenPhil in Eliezer’s mind having been utterly vanquished, the real-world OpenPhil is forced to step in. OpenPhil CEO Holden Karnofsky responds to Eliezer here. There’s a lot of back and forth about whether the report includes enough caveats (answer: it sure does include a lot of caveats!) but I was most interested in the attacks on Eliezer’s two main points. First, the point that biological anchors are fatally flawed from the start and measuring FLOP/S is no better than measuring power consumption in watts.

Holden: If the world were such that: We had some reasonable framework for "power usage" that didn't include gratuitously wasted power, and measured the "power used meaningfully to do computations" in some important sense;
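The remark in the excerpt above that OLS is sensitive to outliers can be illustrated with a small sketch. Everything here is an assumption made for illustration: the data are synthetic (not the AI Impacts dataset), the helper names `ols_fit` and `theil_sen_fit` are hypothetical, and Theil-Sen is simply one robust estimator chosen as an example, since the quoted text does not name one. The last step mirrors the "eliminate all predictions after 2100" approach that the quote describes.

```python
from itertools import combinations
from statistics import mean, median


def ols_fit(xs, ys):
    """Ordinary least squares slope and intercept for y = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def theil_sen_fit(xs, ys):
    """Theil-Sen estimator: median of pairwise slopes, then median intercept."""
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2)
        if x2 != x1
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


# SYNTHETIC data, invented for illustration only:
# year a forecast was made -> year predicted for AGI.
made = [1960, 1970, 1980, 1990, 2000, 2010, 2020]
predicted = [year + 30 for year in made]  # every forecast is 30 years out
predicted[1] = 2300                       # one far-future outlier, made in 1970

ols_slope, ols_intercept = ols_fit(made, predicted)
ts_slope, ts_intercept = theil_sen_fit(made, predicted)

# The approach described in the quote: drop predictions after 2100, then rerun OLS.
kept = [(x, y) for x, y in zip(made, predicted) if y <= 2100]
trimmed_slope, trimmed_intercept = ols_fit(*zip(*kept))

print(f"OLS slope (all points):    {ols_slope:.3f}")
print(f"Theil-Sen slope:           {ts_slope:.3f}")
print(f"OLS slope (trimmed >2100): {trimmed_slope:.3f}")
```

In this toy setup a single extreme forecast is enough to flip the sign of the OLS slope, while the median-based fit and the trimmed fit both stay close to the one-year-per-year line. Whether the real dataset behaves the same way is not something this sketch can show.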

SF8 · 1 mention across 1 issue

SF8 is a software entry in the Astral Codex Ten archive, appearing once, in a single issue dated February 23, 2022. The archive places it in contexts such as "I wanted to compare to SF8 rather than one of the NNUE engines". It appears alongside AGI, AI Impacts, and AIXI.

Reference entry: SF8
Mention count: 1
Issue count: 1
First seen: February 23, 2022
Last seen: February 23, 2022

Outbound links

Issue trail

Biological Anchors: A Trick That Might Or Might Not Work February 23, 2022

AGI 1 shared issue
AI Impacts 1 shared issue
AIXI 1 shared issue
Ajeya 1 shared issue
Ajeya Cotra 1 shared issue

Source context

Biological Anchors: A Trick That Might Or Might Not Work

February 23, 2022 · Original source

I wanted to compare Fritz (which won WCCC in 1995) to a modern engine to understand the effects of hardware and software performance. I think the time controls for that tournament are similar to SF STC I think. I wanted to compare to SF8 rather than one of the NNUE engines to isolate out the effect of compute at development time and just look at test-time compute. So having modern algorithms would have let you win WCCC while spending about 50x less on compute than the winner. Having modern computer hardware would have let you win WCCC spending way more than 1000x less on compute than the winner. Measured this way software progress seems to be several times less important than hardware progress despite much faster scale-up of investment in software.

But instead of asking "how well does hardware/software progress help you get to 1995 performance?" you could ask "how well does hardware/software progress get you to 2015 performance?" and on that metric it looks like software progress is way more important because you basically just can't scale old algorithms up to modern performance. The relevant measure varies depending on what you are asking. But from the perspective of takeoff speeds, it seems to me like one very salient takeaway is: if one chess project had literally come back in time with 20 years of chess progress, it would have allowed them to spend 50x less on compute than the leader.

Response 2: AI Impacts + Matthew Barnett

AI Impacts gathered and analyzed a dataset of who predicted AI when; Matthew Barnett helpfully drew in the line corresponding to Platt’s Law (everyone always predicts AI in thirty years). Just eyeballing it, Platt’s Law looks pretty good. But Holden Karnofsky (see below) objects that our eyeballs are covertly removing outliers. Barnett agrees this is worth checking for and runs a formal OLS regression. Platt’s Law in blue, regression line in orange.

He writes: I agree this trendline doesn't look great for Platt's law, and backs up your observation by predicting that Bio Anchors should be more than 30 years out. However, OLS is notoriously sensitive to outliers. If instead of using some more robust regression algorithm, we instead super arbitrarily eliminated all predictions after 2100, then we get this, which doesn't look absolutely horrible for the law. Note that the median forecast is 25 years out.

I’m split on what to think here. If we consider a weaker version of Platt’s Law, “the average date at which people forecast AGI moves forward at about one year per year”, this seems truish in the big picture where we compare 1960 to today, but not obviously true after 1980. If we consider a different weaker version, “on average estimates tend to be 30 years away”, that’s true-ish under Barnett’s revised model, but not inherently damning since Barnett’s assuming there will be some such number, it turns out to be 25, and Ajeya gave the somewhat different number of 32. Is that a big enough difference to exonerate her of “using” Platt’s Law? Is that even the right way to be thinking about this question?

Response 3: Real OpenPhil

The hypothetical OpenPhil in Eliezer’s mind having been utterly vanquished, the real-world OpenPhil is forced to step in. OpenPhil CEO Holden Karnofsky responds to Eliezer here. There’s a lot of back and forth about whether the report includes enough caveats (answer: it sure does include a lot of caveats!) but I was most interested in the attacks on Eliezer’s two main points. First, the point that biological anchors are fatally flawed from the start and measuring FLOP/S is no better than measuring power consumption in watts.

Holden: If the world were such that: We had some reasonable framework for "power usage" that didn't include gratuitously wasted power, and measured the "power used meaningfully to do computations" in some important sense;
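The hardware-versus-software comparison in the chess excerpt above comes down to two compute-saving factors, roughly 50x from modern algorithms and "way more than 1000x" from modern hardware. A minimal sketch of that arithmetic follows. Expressing the factors in orders of magnitude is an assumption made here for illustration, not a method stated in the excerpt, and the variable names are hypothetical. The 1000x figure is treated strictly as a lower bound.

```python
import math

# Figures quoted in the excerpt; "way more than 1000x" is only a lower bound.
software_factor = 50        # compute saving from modern algorithms
hardware_factor_min = 1000  # lower bound on compute saving from modern hardware

# Express each saving in orders of magnitude (base-10 logarithm).
software_ooms = math.log10(software_factor)
hardware_ooms_min = math.log10(hardware_factor_min)

# Two ways to compare them: raw multiplicative ratio, and ratio on a log scale.
raw_ratio_min = hardware_factor_min / software_factor
oom_ratio_min = hardware_ooms_min / software_ooms

print(f"software saving: {software_ooms:.2f} orders of magnitude")
print(f"hardware saving: at least {hardware_ooms_min:.2f} orders of magnitude")
print(f"hardware/software, raw factor: at least {raw_ratio_min:.0f}x")
print(f"hardware/software, log scale:  at least {oom_ratio_min:.2f}x")
```

How large the gap looks depends on the scale chosen, and both ratios grow if the true hardware factor is well above 1000x, so these numbers are best read as floors rather than estimates.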

Vim · 1 mention across 1 issue

Vim is a software entry in the Astral Codex Ten archive, appearing once, in a single issue dated May 08, 2022. The archive places it in contexts such as "Comment of the week is hiddenhare on UX (and the thread below). But also, Vim". It appears alongside Astralcodexten Com, Book Review Contest, and bulletin board.

Reference entry: Vim
Mention count: 1
Issue count: 1
First seen: May 08, 2022
Last seen: May 08, 2022

Outbound links

Issue trail

Open Thread 223 May 08, 2022

Astralcodexten Com 1 shared issue
Book Review Contest 1 shared issue
bulletin board 1 shared issue
Discord 1 shared issue
hiddenhare 1 shared issue

Source context

Open Thread 223

May 08, 2022 · Original source

.... In this week’s news: 1: Based on your ratings, I’ve selected twelve finalists for the Book Review Contest, and will be posting one every Friday from now until July. 2: Comment of the week is hiddenhare on UX (and the thread below). But also, Vim: 3D video games are running enough math to compute and draw an entire three-dimensional world with tens of millions of triangles and complex interacting physics, and th...
