decision theory
Article
Decision theory is a recurring concept in the Astral Codex Ten archive, appearing twice across two issues between March 14, 2023 and November 08, 2024. The archive places it in contexts such as “Eliezer Yudkowsky worries that supercoherent superintelligences will have access to better decision theories” and “Eliezer Yudkowsky’s decision theory work”. It most often appears alongside Eliezer Yudkowsky, AI Impacts, and the Air Force.
Metadata
- Category: Concepts
- Mention count: 2
- Issue count: 2
- First seen: March 14, 2023
- Last seen: November 08, 2024
Related Pages
- Eliezer Yudkowsky (2 shared issues)
- AI Impacts (1 shared issue)
- Air Force (1 shared issue)
- Harris administration (1 shared issue)
- CDT (1 shared issue)
- CIA (1 shared issue)
- Democrats (1 shared issue)
- Discontinuous Progress In History (1 shared issue)
- Discord (1 shared issue)
- Donald Trump (1 shared issue)
- Drexler-Smalley debate (1 shared issue)
- Einstein (1 shared issue)
Source Context
Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.
Eliezer Yudkowsky worries that supercoherent superintelligences will have access to better decision theories than humans - mathematical theorems about cooperation which let them make and prove binding commitments with each other in the absence of explicit coordination. Not only would this prevent us from intercepting their coordination, but it would be such an advantage that humans (who can’t do this) would be locked out of possible alliances. I agree that if this were true it would be a very bad omen. But human geniuses don’t seem able to do this, so maybe we can re-use the Optimist’s Case above with decision theory as the world-killing technology.
In retrospect, maybe I’m erring by using intuitions I got from Eliezer Yudkowsky’s decision theory work, intended for bargaining with literally-galaxy-brained superintelligences who might respond with things like “Sorry, I’ve already pre-committed to rejecting all offers that would seem like extortion to omniscient entities negotiating from behind a veil of ignorance, and if you think about it carefully you’ll realize that this is fair enough that your own set of galaxy-brained logically-perfect pre-commitments don’t require you to retaliate against me for doing this”. This is a good strategy if you can pull it off, and it forces you to pay a two-thirds tax to place yourself in a bin of slightly-higher-cooperativeness. But Kamala Harris probably hasn’t done this, maybe hasn’t even done any instinctual thing which cashes out to the equivalence of this, and maybe doesn’t respond differently to the outright extortion of “do what I want or I’ll vote Trump” or the massaged-to-fit-a-series-of-fair-precommitments offer of “do what I want or I’ll vote Trump with 33% probability”. In fact, IIUC Kamala hasn’t shown any inkling that these people exist at all (which could itself be a powerful game theoretic strategy!)
So, first of all, if you want a sensible analysis of this, you're gonna have to use logical decision theory instead of causal decision theory, or something that ends up equivalent to LDT by talking about a CDT agent who wants a "good reputation" meaning they always behave like LDT. Worse than that, you're going to have to jump ahead to using folk theorems of LDT that seem like they ought to be proven someday but which we currently lack the representational framework to prove. If you use conventional classical academically standard causal decision theory, there's no notion of "fairness", there is just accepting an offer of $1 in the Ultimatum Game being called "rational", and so Harris should offer Muslims policy the bare minimum better than Trump and Muslims should accept it. This is almost directly isomorphic to the Ultimatum Game, on which the classic causal decision theory answer is "offer $1 and accept $1, for this alone is Rational".
With that said, of course, threats can make good decision-theoretic sense when you are dealing with another agent that is bad at decision theory. Anybody who tries offering you $1 on the Ultimatum Game is probably also a sort of agent that will offer you $10 in the Ultimatum Game if you set up a doomsday nuke that goes off otherwise.
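The Ultimatum Game reasoning in the passages above can be made concrete with a small sketch. It contrasts the causal-decision-theory answer quoted above ("offer $1 and accept $1") with a responder who rejects unfair offers some of the time, in the spirit of the "with 33% probability" offer mentioned earlier. Everything specific here is an assumption for illustration, not something the archive states: the $10 pot, whole-dollar offers, treating an even split as "fair", the small `epsilon` margin, and all function names. It is not a formalization of logical decision theory, only a toy model of why probabilistic rejection removes the incentive to lowball.

```python
from fractions import Fraction


def cdt_proposer_offer(step=1):
    """Offer the smallest positive amount: a CDT responder takes anything above $0."""
    return step


def cdt_responder_accepts(offer):
    """CDT responder: any positive amount beats the $0 payoff from rejecting."""
    return offer > 0


def acceptance_probability(offer, pot,
                           fair_share=Fraction(1, 2),
                           epsilon=Fraction(1, 100)):
    """Hypothetical 'fairness-enforcing' responder.

    Fair offers are always accepted. Unfair offers are accepted with a
    probability chosen so that the proposer's expected take is slightly
    below what the proposer would keep by offering the fair split.
    """
    fair_offer = fair_share * pot
    if offer >= fair_offer:
        return Fraction(1)
    proposer_keeps = pot - offer      # what the proposer pockets if accepted
    fair_keep = pot - fair_offer      # what the proposer pockets under a fair split
    return fair_keep * (1 - epsilon) / proposer_keeps


def proposer_expected_payoff(offer, pot):
    """Proposer's expected dollars against the fairness-enforcing responder."""
    return (pot - offer) * acceptance_probability(offer, pot)


def best_offer(pot):
    """Whole-dollar offer that maximizes the proposer's expected payoff."""
    return max(range(pot + 1), key=lambda o: proposer_expected_payoff(o, pot))
```

Under these assumptions, a proposer facing the CDT responder offers $1, while a proposer facing the probabilistic responder does best by offering the even split, because every lowball offer has a slightly lower expected payoff than the fair one. The passage's closing point about agents who are "bad at decision theory" corresponds to a proposer who would respond to threats rather than computing `best_offer`.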