Ben Goertzel has written an awesome post titled “The Singularity Institute’s Scary Idea (and Why I Don’t Buy It)” on his blog, and the discussion is raging both there and in a newly started top-level post at LessWrong (albeit with a lot more distracting and/or non-rational digressions).
I’d like to cross-post my initial comment (broken there into two comments by character count limits) below (with a bunch of extra links since I couldn’t edit over there to add them).
I’ll start with a response to FrankAdamek’s thoughtful comments. (By the way, like Ben, I have read the vast majority of Yudkowsky’s writing. Feel free to point to specific pieces of it to illustrate your arguments on a given point, but don’t be surprised if I come back with counter-arguments to a number of his assumptions. And, please, don’t wave generally in the direction of a whole sequence and expect me to pick out, or agree with, your particular point.)
Frank, if you insert “responsible” between “other” and “AGI”, I’ll agree with the assumption that “The goal of SIAI (and every other AGI researcher for that matter) is to make an AGI that is safe.” The problem lies both with the definition of “safe” and with the fact that SIAI also claims that an “unsafe” AI will almost inevitably lead to unFriendly AI.
SIAI defines safe as “won’t wipe out humanity”. I think that it is virtually provable that this goal is inherently self-contradictory. I am, apparently, an Unfriendly Human because I can certainly conceive of circumstances where my value system would accept the elimination of humanity (think “a choice between humanity and the annihilation of several more numerous and more advanced civilizations forced by humanity’s aggression or other shortcomings”). I believe that a mind that will not accept the elimination of humanity under any circumstances is fatally flawed.
SIAI is not unique in its concern about anthropomorphism and human bias. It is unique in the stridency of its belief that über-rationality is the ONLY solution. There is a lot of scientific research (referenced in a number of my papers) showing that even “rational” human thinking is compromised by bad assumptions, immediate dismissal of arguments before comprehension (the typical LessWrong denizen will generally deploy the pejorative “confused” when doing this), overlooking contradicting facts, refusing to accept rational counter-arguments, etc. SIAI devotes far too much time to setting up and knocking down straw-man arguments (who, exactly, is going to generate “a random mind”?).
Steve Omohundro’s paper was awesome. Unfortunately, it also goes totally off the rails midway through, when he starts making incorrect assumptions after overlooking some contradicting facts. Humans are goal-driven entities, yet the majority of us do NOT behave like psychopaths even when we are positive we can get away with it (i.e. when the social structure doesn’t force us to behave). Missing this point means missing the path to the solution (I’ve written a lot more about this here, here, and here).
I particularly vehemently disagree with Eliezer’s characterization of the human value system as “fragile” and with the belief that the complexity of the human value system is any sort of a show-stopper. As I argued in my AGI ’10 presentation, the best analogy to the human value system is the Mandelbrot set as viewed by a semi-accurate, biased individual who is also subject to color illusions (as are we all). The formula for the Mandelbrot set is rat simple, yet look at the beautiful complexity it leads to.
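To make the analogy concrete, here is a minimal sketch (mine, not from the original presentation) of just how simple that formula is: the entire Mandelbrot set falls out of iterating z → z² + c and checking whether z stays bounded.

```python
def in_mandelbrot(c: complex, max_iter: int = 50) -> bool:
    """Return True if c appears to stay bounded under z -> z*z + c."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:  # once |z| exceeds 2, the orbit is guaranteed to diverge
            return False
    return True

# A coarse ASCII rendering: all of the boundary's intricacy emerges
# from nothing more than the two-line update rule above.
for row in range(-10, 11):
    print("".join(
        "#" if in_mandelbrot(complex(col / 20, row / 10)) else " "
        for col in range(-40, 21)
    ))
```

The point of the analogy: a trivially simple generating rule can produce structure of effectively unbounded richness, so the observed complexity of human values does not imply that their underlying formula must be complex.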
I (and the social psychologists) believe that the foundational formula of ethics (Friendliness) is quite simple: do that which is necessary to optimize the possibility of co-existence. (It’s the categorical imperative that Kant was looking for, and arguably a solution to Eliezer’s CEV question.) It is very close to what SIAI is proposing, but there are absolutely critical distinctions, the most major/obvious being: if humanity is a danger to the co-existence of everyone else, and wiping humanity out is the only way to reduce that threat below a near-unitary value, the above imperative will dictate its destruction.
I also firmly believe that the above imperative is an attractor that Omohundro should have recognized. It causes most of the “random minds” in Ben’s point 1 to converge to something like human ethics; it makes human values robust and regenerative (as opposed to fragile); and it makes it extremely unlikely that an AGI possessing it will diverge into unFriendly territory, or permit others to do so (see altruistic punishment, which, by the way, could be very bad for humans if we absolutely insist on our evil ways).