Many modern humans, I think, would become more humane if they got smarter. I'd hesitate to predict the same of an ancient Athenian. My wild guess is 5% of aliens end up nice; would be very happy but not shocked to hear 30%. For human-built AIs using DL, I expect ~0%.
Why? Obviously not because silicon can't implement kindness; of course it can. Obviously not because it's impossible to blunder into niceness by accident; if it were, I wouldn't expect it of about 5% of aliens. Rather it's that, on my model, kindness is 5% dense in one particular part of the design space, reached by a particular kind of hill-climbing. There's a causal story behind how some humans and human cultures ended up nice, in a place that I think would get even nicer with further intelligence enhancement; that causal story does not look convergent to me. Our particular story isn't the only possible story, but it has contingencies in it like:

- Human bands and tribes are the right size for us to need to trade favors with people we're not related to.
-- This probably happens with many alien species, but maybe not with the ones that, for example, grow out of scaled-up insect hives. If you tell me that the aliens grew out of scaled-up ant colonies, my probability that they're nice drops a lot (though not to 0%).
-- Similarly, if the aliens are smart enough to track all the favors explicitly and do explicit calculations about who's likely to betray them, they'd have less need for rough intuitive feelings of friendship as a design feature to boost their inclusive genetic fitness. (This is a theme of my fiction piece "Kindness to Kin".)
-- Maybe if the aliens evolve underwater, they end up not-nice because they can't defend volumes of water the way we can defend areas of land, and so people don't tend to form friendly alliances with non-closely-related allies.
- We hominids learned to model our conspecifics via empathy, forcing our brains to imitate the brains of the things we wanted to model. (This plausibly happens a lot with social aliens, but maybe not all aliens are social at the right scale.)
And then, when we had reason to compute what favors we could efficiently trade with allies and trade partners, evolution stumbled into having our brains do it by the easily-blindly-accessible design of sympathy: feeling what our beneficiaries were feeling in our imaginations, being directly sad and happy ourselves when we imagined allies being sad or happy; not just knowing what they'd feel (which is mere 'empathy' as I'm now technically redefining that term).
-- Again, not impossible as something that'd happen on an alien planet, too. Could easily happen more than 50% of the time, conditioned on empathy already being a thing. But contingencies like that, I worry, pile up.
-- If you told me that the aliens had a single memory system for brains and genes, that they encoded skills in their version of DNA and passed them on heritably or even horizontally by sharing blood, my expectations for their human-style niceness would drop, though not to zero. Those sound like aliens that might end up with a different empathy-architecture that didn't do sympathy.
- Our culture has a religious phase in its history, and that religion invented the idea of God, and the idea that God gave everyone a soul, and at some point somebody said "Well then these strangers and other races and faraway cultures and lands are also the children of God and also have souls, and we ought not to enslave them for our own benefit." I cannot guess how often human cultures would follow some other chain of reasoning like this, to begin expanding their circle of explicit concern beyond the tribes for which they instinctively feel fellow-feeling, to stop exploiting slaves even when it benefits them, if there were no phase of delusory religious reasoning about it. Maybe it'd happen 90% of the time, maybe it'd happen 10% of the time. But it's one more contingency.
-- You can imagine aliens so humanlike that they end up with humanlike religions, but I'd expect that not to be very convergent, because religions are shaped like the flaws in human reasoning, and there are a trillion-plus wrong ways to do something for every right way. All valid reasoning is pretty much the same; every invalid reasoning system is invalid in its own way.
- Human brains have a weird and not entirely metaphysically consistent notion of morality as a kind of reified thing, of shouldness-propositions tossed into a pool with every other kind of proposition. This is, I think, not universally valid reasoning that you'd expect to be convergent, though I wouldn't quite call it wrong.
-- Maybe without that, you just don't get aliens thinking about "What is the rule for how to treat sentient beings?" and instead get them thinking about "How do I want to treat sentient beings?"
- Hominids invented technology soon after achieving preliminary sparks of general intelligence, because we had coal on the surface and could burn wood for fire.
-- It might have taken us much longer to invent technology if we'd evolved through an aquatic phase; and, over that time, natural selection might have had more of a chance to hammer the accidents of kindness out of us: ensure we wouldn't tip at restaurants, give us more special cases for betraying people we wouldn't meet again inside a larger world.
-- This is also a major theme of Niven and Pournelle's "The Mote in God's Eye": aliens that have been stuck for a very long time in their star system, and have been more thoroughly reshaped by evolution to have their intelligence chained to the purposes of reproduction.
Gradient-descending an AI system to predict text, or even to play video games, is nothing like this. It is exploring nowhere near this space. Gradient descent of matrices is not mutation-selection of proteins and I don't expect it to hit on anything like similar architectures.
"Train it to be nice" is the obvious thought. Alas, I predict that one idiom that does generalize from natural selection to gradient descent, is that training on an outer loss gets you something not internally aligned to that outer loss. It gets you ice cream and condoms.
As we say in Germany: "Whoever wants to fuck must be friendly." [solved]
There is a correlation between emotional intelligence and IQ. It requires more intelligence to be aware of all the other participants in a system than to be oblivious to them. Empathy seems more like a byproduct of intelligence. Perhaps AI alignment should be rebranded as AI morality.
Wait, the paragraph with "morality is a reified thing" implies that you're maybe a moral realist? I've never heard you say anything on that point, but moral antirealism seems pretty obvious to me.
By the same token, avian civs might be unable to defend their territory either, further lowering p.
How about this: a precondition for AIs to be considered powerful would be having mastered Bayesian reasoning, that is, being really good at quantifying uncertainty and risk. (Current AIs are not well calibrated, are confident when they shouldn't be, etc., and are thus not that smart.)