Conversation

Replying to and
Finished. It’s a good overview, in fact. The field has dramatically matured and improved in the past few years, btw. The stupid stuff that dominated LW 3+ years ago is discredited. There’s now much that’s plausible and interesting. Surprised!
Replying to
Any particular line of research you found particularly promising? I have a strong intuition that the whole enterprise is misconceived, but there are many smart people who think otherwise, so I'm probably wrong.
Replying to and
Quick browse suggests the part that’s good is just the label “alignment” being slapped onto ordinary engineering improvements, not the earlier esoteric stuff. I.e., the part that’s good is not alignment, and the part that’s alignment is not good. I’ve been expecting precisely this.
Replying to and
Yes, I mostly agree with this, and my write-up will say so. On the whole, the field has concluded that “alignment” is infeasible (although a holy grail still worth putting some effort into), and doing sensible, fairly obvious things is a better bet (but comes with no guarantees).
Replying to and
Prompted a thought: is "alignment" just nerdspeak for human states like empathy, compassion, solidarity? Humans are often non-aligned but we have those concepts to describe the happy times where our goals do not conflict.
Replying to and
Fwiw I think those human-centric terms are equally unsatisfactory because they locate the behavioral roots in individual traits. What I think comes closest is Hannah Arendt’s notion of freedom-in-mutualism, which I think of as entangled infinite-game I-it and I-thou relations.
Replying to and
Humans can be highly misaligned, even diametrically so, but so long as they don’t quit the mutual entanglement or try to win absolutely through killing, the collective is in a healthy place
Replying to and
It’s easier to define the opposite, actually: freedom in mutualism (FiM) is the opposite of irreversible hard-forking. Necessary and sufficient conditions for FiM: a) play to continue the game rather than to win, b) recognize other agents as agents, c) have a non-terminating relationship with facts.