AI Safety Needs Social Scientists

Properly aligning advanced AI systems with human values will require resolving many uncertainties related to the psychology of human rationality, emotion, and biases. These can only be resolved empirically through experimentation — if we want to train AI to do what humans want, we need to study humans.

This is the first publication on Distill in quite some time, and it comes from OpenAI. It's an interesting topic: to the extent that we care about "AI alignment" (and we should), we need to know a lot more about ourselves before we can reliably express our own optimization functions.

As always, OpenAI's work is quite long-term focused, but I find it worthwhile to pay attention to what they're thinking about. They're living in a future that the rest of us will reach in the coming years.
