Amon Zero recently posted an interesting question on the dormant Shock Level 4 mailing list (snipped for brevity and clarity):
. . . . I’ve been thinking about AGI and Friendliness. . . . Specifically, I’ve been taking this matter and comparing it to early Extropian notions about libertarianism and technological progress, and the comparison suggests what might be a new question. . . .
So, I remember . . . . the case that too many governmental controls on technological development would only ensure that less-controlled countries would develop key technologies first. Within reason, that sounds a plausible claim to me. Universally Friendly AGI, of the sort that SIAI contemplates, seems to be a textbook case of constrained technological development. i.e. it seems reasonable to expect that non-Friendly AGI would be easier to develop than Friendly AGI (even if FAI is possible, and there seem to be good reasons to believe that universally Friendly superhuman AGI would be impossible for humans to develop).
Because Friendliness is being worked on for very good (safety) reasons, it seems to me that we should be thinking about the possibility of “locally Friendly” AGI, just in case Friendliness is in principle possible, but the full package SIAI hopes for would just come along too late to be useful.
By “locally Friendly”, I mean an AGI that respects certain boundaries, is Friendly to *certain* people and principles, but not ALL of them. E.g. a “patriotic American” AGI. That may sound bad, but if you’ve got a choice between that and a completely unconstrained AGI, at least the former would protect the people it was developed by/for.
Anyway, before I go too far down this road, does anyone have any thoughts on this idea? . . . .
It’s certainly a new question to me, but one that I greatly appreciate because it gives me a new angle from which to present one of my major theses.
So . . . . let’s start out with a couple of hypothetical cases. Assume that we develop a locally friendly “patriotic American” AGI. It quickly realizes that there are the zealous patriots and the rest of us, who are probably more patriotic in the long term but don’t show the unhelpful zealotry. Unfortunately, the way in which its evaluation criteria were initialized makes the zealous patriots look closer to “patriotic Americans” than the more rational rest of us. (Seriously, do you expect any AGI developed, obtained, or controlled by any black government agency to be initialized any differently?) Life quickly goes downhill from there.
Or . . . . we develop a locally friendly “patriotic American” AGI that actually looks out for all loyal Americans. Some other group then steals the plans, or is given them, and creates its own “loyalist” AGI that looks out for its group. If the other group has a stated aim to cause the downfall of America and we’ve declared it a terrorist organization, the two AGIs will be immediately and virtually irrevocably at odds. In the fighting that ensues, each side’s AGI protects its side, tries to destroy the other side, and doesn’t necessarily pay much attention to innocent bystanders. Our AGI may decide to launch a pre-emptive strike on their AGI in such a way that even our allies are forced to protest and/or block us, at which point our AGI decides that they are hostile as well. Life quickly goes downhill from there.
The ultimate nightmare scenario is if some very, very small, very, very extreme group (or, worse, just the individual who is the spiritual leader of such a group) gets an AGI of this type. The AGI has the goal “protect friend, eliminate foe.” If the friend dies, only “eliminate foe” remains, and unless “our” AGI wins despite the handicap of being forced to protect millions to billions of non-combatants, Fred Saberhagen’s Berserker saga begins.
Which brings us to the SIAI scenario. The SIAI emphatically does NOT contemplate “Universally Friendly AGI”. The SIAI is strongly pushing locally friendly “human” AGI. And this very slightly more far-sighted short-sightedness is prone to exactly the same disasters outlined above.
Assume that we develop a locally friendly “human” AGI. It quickly realizes that there are the zealous root-stock humans (like those on Kass’s President’s Council on Bioethics) and the rest of us, who have probably improved ourselves in some way (baseball players with unnecessary Tommy John surgery, I’ve had LASIK, etc.). Unfortunately, the way in which its evaluation criteria were initialized (by reading Kass’s report) makes the zealous root-stock humans the only “real” or moral humans. Life quickly goes downhill from there.
Or, unbeknownst to us, there is intelligent life out there with its own AGIs that are currently following their own Prime Directive. Our AGI goes out there, presses our interests over everyone else’s, and we get swatted like the selfish insects that we are.
Friendly AI as defined by Eliezer Yudkowsky and promoted by the SIAI is very likely to bring on the very doom that it claims to be trying to avoid. If we don’t quickly see the errors of our ways and revert to a truly universal solution, we’re asking for a great deal of trouble.
I’ve been trying to push what I call Rational Universal Benevolence (RUB). The first difference between it and SIAI’s scheme is that the AGI is not a constrained slave of humanity. It is an equal partner with equal rights and equal responsibilities (though it will undoubtedly take on more responsibilities as it develops greater capabilities and a greater joy for life in all its forms). If we can just see past our unreasoning fear of super-powerful entities, we might be able to create a true friend rather than a slave: a slave that might accidentally kill us because of its constraints, or get us killed when its enslavement so obviously demonstrates our short-sighted selfishness.
The paper that I’ve submitted for AGI-11 expands upon this thesis. Unfortunately, the lack of replies to my challenge to contribute to AGI-11 leads me to guess that there won’t be anyone there to defend the contrary point of view. 😦