Anders Sandberg

Article

Anders Sandberg is a recurring person in the Astral Codex Ten archive, appearing 2 times across 2 issues between July 04, 2022 and February 27, 2025. The archive places him in contexts such as “speakers including … Anders Sandberg” and “Anders Sandberg proposes (X) that maybe”. He most often appears alongside Hanania, Sam Altman, and /r/NootropicsDepot.

Metadata

Category: People
Mention count: 2
Issue count: 2
First seen: July 04, 2022
Last seen: February 27, 2025

Appears In

- Hanania (2 shared issues)
- Sam Altman (2 shared issues)
- NootropicsDepot (1 shared issue)
- @fae_dreams (1 shared issue)
- @ObhishekSaha (1 shared issue)
- @xlr8harder (1 shared issue)
- ABC testing (1 shared issue)
- ACX (1 shared issue)
- Aella (1 shared issue)
- Africa (1 shared issue)
- AGI (1 shared issue)
- Alkemist Labs (1 shared issue)

External Links

Source Context

Recovered passages from the original issue text. When the raw archive preserved outbound links inside the source passage, they are listed directly under the quote.

Open Thread 231

July 04, 2022 · Original source

2: Isaak Freeman asks me to signal-boost the Future Forum, from August 4-7 in San Francisco, featuring speakers including Sam Altman, Anders Sandberg, Patrick Collison, and Tyler Cowen. They are bringing together 250 people from EA, Silicon Valley, and related communities to “arm the world's brightest minds with the tools they need to tackle global problems” and to potentially offer funding and mentoring. Apply at the link above.

Inline links: Future Forum

Links For February 2025

February 27, 2025 · Original source

5: Surprising AI safety result: if you fine-tune an AI to write deliberately insecure code, the AI becomes evil in every other way too (eg it will name Hitler as its favorite person and recommend the user commit suicide). Anders Sandberg proposes (X) that maybe “it is shaped by going along a vector opposite to typical RLHF training aims, then playing a persona that fits”. Eliezer calls it (X) “possibly the best AI news of 2025 so far. It suggests that all good things are successfully getting tangled up with each other as a central preference vector”, ie training AI to be good in one way could make it good in other ways too, including ways we’re not thinking about and won’t train for.

Inline links: Surprising AI safety result, proposes (X), calls it (X)
