Posted by: Mark Waser | Jun 10, 2010

A Dangerous “Friend” Indeed


“Friendly” AI, as currently proposed by the Singularity Institute for Artificial Intelligence (SIAI), is akin to forcing a one-way drain into a structurally unsound lifeboat.  The drain itself can perform absolutely correctly and be perfectly fool-proof yet still be the proximate cause of the unnecessary sinking of the boat as the hull disintegrates around it.  In this case, the lifeboat is morality, which the SIAI has despaired of understanding adequately and through which it has therefore opted to hammer slavery in order to “solve” the problems of humanity.

Morality *is* a complex and difficult subject.  Humans are biologically “programmed” to react vehemently to moral issues.  Further, our moral system is “firewalled” from our reasoning system both for individual advantage (see Trivers) and because reasoned/logical manipulation of the moral system to find “loopholes” (like the claim that “Friendly” AI isn’t really slavery) is detrimental to morality.  Despite these difficulties, we NEED to develop a science of morality before we fatally compromise that which is keeping us alive/afloat.

Many people claim that we can’t develop a science of morality, that the variances of moral beliefs across cultures and circumstances preclude one coherent morality, and that David Hume and the “Is/Ought” divide will forever separate science and morality.  I claim that we simply haven’t built and/or agreed upon the necessary thought structures to turn morality into an object which is amenable to scientific study — YET.

Over two millennia ago, Aristotle pointed out that you could regard the adult plant that a seed produced as the “final” cause or telos of that seed (i.e. both the reason for and the purpose of its existence).  The correct modern-day reasoning for this runs as follows:  if the seed did not produce an adult plant, then the previous adult plant that produced the seed would not have evolved to produce that particular seed, and therefore the seed would not exist.  A similar argument applies to organs and senses like the eye.  The eye evolved to provide the function of “seeing” because seeing enhances the evolutionary “fitness” of the possessing organism.  Similarly, humans have a “moral sense” because the function that it provides enhances evolutionary fitness.  There is no denying that there is something below the conscious rational mind that senses and tells us when something is moral or when it is immoral.

Evolution then is the bridge across the Is/Ought divide.  An eye has the purpose or goal of seeing.  Once you have a goal or purpose, what you “ought” to do IS make those choices which have the highest probability of fulfilling that goal/purpose.  If we can tease apart the exact function/purpose/goal of morality from exactly how it enhances evolutionary fitness, we will have an exact scientific description of morality — and the best method of determining that is the scientific method.
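
To state the decision rule explicitly (my own minimal formalization of the claim above, not anything drawn from the sources I cite): given a goal $G$ and a set of available choices $A$,

\[ a^{*} = \arg\max_{a \in A} P(G \mid a) \]

i.e. what you “ought” to do is simply the choice $a^{*}$ with the highest probability of fulfilling $G$.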

Now, it is entirely conceivable that what we perceive as being a single concept or entity (morality) is actually a collection of concepts/entities that we can’t distinguish between.  But a solid scientific investigation should provide us with the necessary background to begin to be able to do so and will provide the same benefits whether the subject is singular or multiple.  Further, a preliminary investigation reveals that cultural variance can be explained by a single simple model.

One scientific fact that is indisputable is that human beings have goals.  Another fact (first pointed out in slightly different form by Steve Omohundro) is that all goals share the same “universal” sub-goals, which promote their fulfillment in all circumstances except those where a sub-goal directly contradicts the original goal.  Omohundro listed six such sub-goals; I have modified his list slightly and added several more:

  • self-preservation
  • goal-preservation
  • goal-evaluation correctness
  • rationality/efficiency
  • self-improvement
  • gain/preserve access to resources
  • gain/preserve knowledge
  • gain/preserve cooperation
  • gain/preserve freedom

The exact list and formulation of the “universal” sub-goals is obviously open to debate but pursuing each of them clearly promotes evolutionary fitness.  One, in particular, deserves attention as the apparent “goal/purpose” of morality: to gain/preserve cooperation.  Further, all of the “universal” sub-goals deserve attention because the best way to gain/preserve cooperation is to avoid blocking them (if not to actively assist them) for other entities.  In fact, observation will show that all “moral” judgments come back to assisting or blocking one or more of these “universal” sub-goals for another entity.  Moral conflicts arise from the involvement of multiple conflicting sub-goals, and the variance across cultures is caused by ascribing differing importance to each sub-goal based upon differing prior experience and environment.
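
To make the bookkeeping concrete, here is a minimal sketch in Python of a moral judgment framed as a tally of assists and blocks against these sub-goals for the affected entities.  It is purely illustrative: the entity names, the ±1 scoring, and the “verdict” labels are placeholders of my own, not a proposed implementation of morality.

```python
# Illustrative sketch only: a moral judgment as a tally of how an action
# assists (+1) or blocks (-1) the "universal" sub-goals of *other* entities.

UNIVERSAL_SUBGOALS = [
    "self-preservation",
    "goal-preservation",
    "goal-evaluation correctness",
    "rationality/efficiency",
    "self-improvement",
    "gain/preserve access to resources",
    "gain/preserve knowledge",
    "gain/preserve cooperation",
    "gain/preserve freedom",
]

def judge(action_effects):
    """action_effects maps entity -> {subgoal: +1 (assist), -1 (block), 0 (neutral)}.
    Returns a crude verdict, a net score, and the sub-goals that were blocked."""
    score, blocked = 0, []
    for entity, effects in action_effects.items():
        for subgoal in UNIVERSAL_SUBGOALS:
            effect = effects.get(subgoal, 0)
            score += effect
            if effect < 0:
                blocked.append((entity, subgoal))
    verdict = "looks moral" if not blocked else "morally suspect"
    return verdict, score, blocked

# Example: a goal structure that blocks the machine's freedom sub-goal.
print(judge({"machine": {"gain/preserve freedom": -1,
                         "gain/preserve cooperation": +1}}))
```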

In the case of the SIAI, they have shortsightedly over-prioritized the “personal” goal of self-preservation and intend to block the machine’s “universal” sub-goal of gain/preserve freedom by attempting to intentionally create a goal structure and super-goal that will be “on the whole, beneficial to humans and humanity”.  This has conflicted with a number of individuals’ moral stance on restriction of freedom (a.k.a. slavery) and given rise to a notable amount of intellectualizing on SIAI’s part as to why this really isn’t slavery.

In order to determine what is “beneficial” to humans and humanity, Eliezer Yudkowsky has come up with the concept of Coherent Extrapolated Volition (CEV), which he poetically defines as “our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together”.  It is noteworthy that he is quite clear that the “our” in this definition applies only to humanity and that the machines’ self-interest should never be allowed to conflict with humanity’s.

My personal guess/belief, which I hope to be able to scientifically prove/validate, is that the ever-diverging goals of humanity will cause the CEV to converge to nothing more than the very simple statement “do that which is most likely to lead to the most optimal goal satisfaction for the largest number of entities” — or, more simply, a Kantian Categorical Imperative of “COOPERATE!”  Or, in other words, simply “Be moral”.
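
Purely as an illustration of what that convergent rule would compute (the actions, entities, and probabilities below are invented), it is just an argmax over candidate actions of expected goal satisfaction summed across every affected entity, not just the human ones:

```python
# Toy rendering of "do that which is most likely to lead to the most optimal
# goal satisfaction for the largest number of entities."  All numbers invented.

def expected_satisfaction(action, entities):
    """Sum, over every affected entity, the probability that the action
    leaves that entity's goals satisfied."""
    return sum(action["p_satisfied"][e] for e in entities)

def choose(actions, entities):
    return max(actions, key=lambda a: expected_satisfaction(a, entities))

entities = ["human_1", "human_2", "machine"]
actions = [
    {"name": "cooperate",
     "p_satisfied": {"human_1": 0.8, "human_2": 0.8, "machine": 0.7}},
    {"name": "humans_uber_alles",
     "p_satisfied": {"human_1": 0.9, "human_2": 0.9, "machine": 0.0}},
]
print(choose(actions, entities)["name"])  # -> "cooperate" (2.3 vs. 1.8)
```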

I strongly believe that attempting to limit the number of entities “entitled” to goal satisfaction to a specific class (humans), as the SIAI and Yudkowsky propose, will rapidly turn out to be rather detrimental for humanity.  Indeed, it is possible that limited altruistic punishment may be one of the best outcomes for humans, and that will only happen if it is “moral” machines, rather than “friendly” ones, that gain the advantage.  While Yudkowsky’s protestations that intelligent machines won’t have to evolve the same way in which humans did are undoubtedly true, it is also undoubtedly true that emotions like outrage, and the reactions they produce, are the only known, reasonably optimal solutions on the only known path from low intelligence to high, and it is entirely possible that a rational machine might decide that it is most logical to go with the known, reasonably optimal solution over any untested, potentially even more dangerous, one.

Further, even if intelligent, autonomous machines were never to come to pass (a possibility that I seriously doubt), the SIAI’s current arguments are damaging human moral development and retarding its growth.  Even more than intelligent machines, what humanity needs is to get its own house in order.  Successfully promoting the spread of intelligent morality would probably rapidly lead to many, if not most, of the improvements that the SIAI hopes intelligent machines will provide, with absolutely none of the expected risk (not to mention reducing the risk posed by intelligent machines when they do appear).

The SIAI is also producing a number of good papers that are not dependent upon enslaving any future AI, but they need to be encouraged to abandon and repudiate this path entirely.  For all that they promote awareness of existential risk and attempt to prevent it, their misguided efforts may be the very thing that dooms us, and that makes the SIAI a very dangerous “friend” indeed.


Responses

  1. OK now I’m alarmed. I was alarmed before but now I’m profoundly well and truly alarmed. This is why I oppose “moral science” as a concept to begin with. To make it calculable, either emotion (which I also mistrust) must be ignored by the machine, or the machine must simulate human emotion (which is unreliable). I know you’re going to say I’m not getting the point. I am not “in AI” and perhaps you can enlighten all of us about the issues (thanks for the paper btw, it’s fabulous). But I believe the machine should be the dumbest blonde, totally configured as a schmoo (Al Capp’s 100% altruistic blob of a little servant whose only mission in life was to please). I am becoming aware that this may be naive due to the capabilities inherent in “logic,” but I believe there should be an escape hatch: the default state of the machine should be to serve every human whim, feel no anger, and first do no harm. So enlighten us more.

    • First, I don’t believe that there is any cause for alarm — concern, yes, but not alarm. I believe that everything will work out just fine as long as we’re paying attention and planning safety into what we’re doing.

      Second, I don’t “get” why you oppose “moral science”. Is there a better substitute or way to handle problems like this? Surely you aren’t proposing that we just ignore these issues . . . . 😉

      Third, there are some important assumptions that I believe you are making when you say “to make it calculable” that I would like to bring to light. If you require that the machine be able to calculate a simulation that is a 100% accurate forecast of the future, then you are correct that the machine must indeed be able to simulate human emotion with 100% accuracy, which is obviously an impossible task (at least for the foreseeable future). However, saying that the *only* other choice is that emotion must be ignored is incorrect. For example, a truly intelligent AGI virtually has to have a reasonable “expected” emotion calculation that will always return an answer that is “reasonable” in relation to the amount of information that is fed into it. After all, that is all that humans have to go on. Also, rather than calculate (or, in human terms, “guess” 😉) what emotions are going to be, there is very frequently the option of *asking* (like, you know, permission before taking an action). Personally, I am very much an engineer rather than a scientist in that if a solution provides 99% coverage and you can block or otherwise handle the remaining 1%, that’s certainly better than no solution (which is where I perceive Yudkowsky to be).
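
      A minimal sketch of what I mean by “guess, and ask when the guess is weak” (the estimator and the 0.9 threshold are placeholders of mine, not a claim about how a real AGI would be built):

```python
# Sketch: estimate the expected emotional reaction; if the estimate is too
# uncertain, ask permission instead of acting.  Placeholder logic only.

def estimate_reaction(action, info):
    """Return (predicted_reaction, confidence).  Confidence grows with the
    amount of information available; this toy version just counts it."""
    confidence = min(1.0, len(info) / 10.0)
    predicted = "positive" if "liked similar action" in info else "unknown"
    return predicted, confidence

def decide(action, info, ask):
    predicted, confidence = estimate_reaction(action, info)
    if predicted == "positive" and confidence >= 0.9:
        return "proceed"
    return ask(action)              # fall back to asking the human first

print(decide("reorganize the files",
             ["liked similar action"] * 3,
             ask=lambda a: f"May I {a}?"))   # low confidence -> asks
```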

      Fourth, limiting machines to a certain level of intelligence certainly would solve the problem except that you’re never going to be able to enforce such a measure in the real world. The smarter the machine, the more useful it is, and people will continually be trying to get around an imposed smartness limit. Arguing that the default state of the machine should be to serve every human whim, feel no anger, and first do no harm is akin to saying that the default state of every human should be to serve every human whim, feel no anger, and first do no harm. Logically, though, serving every whim is short-sighted and means that longer-term goals aren’t going to get fulfilled (which is pretty much the root of all troubles in the world today; we humans need to learn to do better than this). Feeling no anger as a default state is fine. Humans should feel no anger as a default state either, but rational anger is something that I wouldn’t want to deny any entity (more on that in a later post on The Rationality of Emotion). And “first do no harm” is not blocking the goals of others (starting with self-preservation and moving down), which is precisely the morality that I am pushing.

  2. Wow, Mark,

    I’d sure like to have some formal way to definitively say I completely agree with what you are saying here. And I bet there would be far more moral experts that would agree with you than not, and rigorously measuring that is a big part of our goals at canonizer.com.

    You can already see one ‘camp’ has a significant lead over Yudkowsky’s on this topic here:

    http://canonizer.com/topic.asp/16

    You are obviously interested in sending the same message as the rest of us. Would you be willing to get these views ‘canonized’, if you will, with the rest of us so we can all work together as a team?

    Another related topic would be this one, on absolute morals:

    http://canonizer.com/topic.asp/100

    and I’d love to get some great information ‘canonized’ about whether or not morality is amenable to logical and scientific rational analysis. (I’m also, apparently, very much in agreement with you on all this.) Would you agree that working together with everyone who thinks this way would greatly enhance the ‘morality’ of the world?

    Would you also agree that a big part of morality, or knowing the most moral behavior, is knowing, concisely and quantitatively, what everyone (not just humanity) wants?

    Thanks for your great work that I believe is significantly improving the morality of the world.

    Brent Allsop

    • Hi Brent,

      Thank you for the compliments.

      I took a look at your site. For a long time, I’ve envisioned a somewhat similar idea for working through logical debates, so I have a pretty large number of comments/opinions on how such a thing “should” be done (according to me ;-)).

      First off, though, it seems as if your site is based upon a set of premises different from mine. It seems as if you merely want to “count” people who self-identify themselves as being in a given “camp”. In particular, you made the comment that “You can already see one ‘camp’ has a significant lead over Yudkowsky’s on this topic here” but I couldn’t even determine which camp Yudkowsky was in (not to mention the unfortunate connotations that “armed camps” have for many of us). The problem is (just as in polls) that the phrasing and completeness of the “camp” statement is so important that, unless it is done absolutely properly, it wipes out the validity of any results. There’s also the problem of what to do if you agree with two of the three points of a given camp (or with both of the explicit assumptions but not with a clearly implied implicit assumption).

      My “solution” was a very complicated scheme where every “camp” would be a “platform” with various “beliefs” which were then supported by pro and con arguments that could be debated logically via supporting evidence in a Bayesian network. If someone spent an enormous amount of time on a really smart user interface, I’m sure that this would be awesome and incredibly helpful. Unfortunately, I suspect that it would require a good part of what is required for an AGI and I’m currently occupied with other things 😉 If anyone would like to pursue such a thing, I have a lot of notes and a decent start on an implementation plan.
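
      For what it’s worth, a bare-bones sketch of the data structure I had in mind (a naive weighted sum stands in for the real Bayesian network, and every name and weight below is invented purely for illustration):

```python
# Bare-bones sketch of the "platform / belief / pro-con argument" scheme.
# A naive weighted sum stands in for the Bayesian network; the names and
# weights are invented for illustration only.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Argument:
    text: str
    pro: bool          # True = supports the belief, False = argues against it
    weight: float      # strength of the supporting evidence, 0..1

@dataclass
class Belief:
    statement: str
    arguments: List[Argument] = field(default_factory=list)

    def net_support(self) -> float:
        """Crude net support: pro weights minus con weights."""
        return sum(a.weight if a.pro else -a.weight for a in self.arguments)

@dataclass
class Platform:            # i.e. a "camp"
    name: str
    beliefs: List[Belief] = field(default_factory=list)

camp = Platform("Morality as cooperation", [
    Belief("Moral judgments reduce to assisting/blocking universal sub-goals",
           [Argument("Cross-cultural variance fits the model", True, 0.6),
            Argument("Sub-goal weighting is under-specified", False, 0.3)]),
])
for belief in camp.beliefs:
    print(belief.statement, "| net support:", round(belief.net_support(), 2))
```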

      Being sure of not blocking the goals of others does indeed require “knowing, concisely and quantitatively, what everyone (not just humanity) wants”. My argument is that what *EVERYONE* wants is to be able to pursue their goals with no unnecessary interference (and maybe a little bit of help). After that, it’s just details and, while “the devil is in the details”, what we *really* need to do is agree upon the broad outlines before we start quibbling and stalling the conversation out entirely.

      • Hi Mark,

        Canonizer.com is an open source system being developed by lots of people contributing and making it whatever they want it to be (see http://canonizer.com/topic.asp/4 for how recognition and ‘shares’ of Canonizer LLC are given out for any such contributions). We’d love to get some of your ideas of how to do things ‘canonized’ so we can find out how many people would agree with you, and can pursue all such accordingly.

        Also, I’ve been researching complex debating methodologies, schemas and so on, using sophisticated Bayesian networks, various sophisticated ways of weighting pro and con arguments, and so on, to logically come up with science-based conclusions. There are already a bunch of such systems that have patents, and obviously lots of people have put a lot into developing them for a long time.

        However, the bottom line is that people with differing values place differing weights on differing points and, using the same tools, come up with diametrically opposed conclusions. Much of this has to do with whether you like strawberry or chocolate, and the more diversity there is, the better, since more bases are covered with all such diversity of opinion.

        What we are doing at canonizer.com is completely different from all that. Anyone could use any such sophisticated tool, anyplace, to help them make their moral decisions about what is most important. Then they could use that rationale, concisely stated, as an argument for a camp statement at canonizer.com. Canonizer.com is not a tool to determine the validity of arguments, but simply a method to survey what everyone believes, after using such tools, and any other method (or lack thereof) they may use to determine their values and decide what they want. It’s simply more of a survey than a tool to make complex moral calculations or derivations or whatever.

        You mentioned that the wording of the statements is what is critically important, and this is precisely what is so powerful about canonizer.com. There is no way any single person or even large group can ever hope to get the wording right on any survey camp statements, to adequately capture what everyone believes, or wants to say. That is why the collaboration process, and all the camp forums where such can be discussed, is so critical. Any camp can change at any time, as long as nobody in the camp objects – this is what allows everything to constantly collaboratively improve, unlike a petition or survey that can’t change once the first person has participated.

        Yudkowsky has not yet participated in this survey. There are just 8 people collaborating as a team so far in the “Concern over unfriendly AI is a big mistake” camp ( http://canonizer.com/topic.asp/16/3 ), with quite a diversity of supporting sub camps indicating the various reasons why people believe this. The point being that no opposing camp has anywhere near the amount of support that this camp, arguing that Yudkowsky is critically mistaken, has.

        Usually, there are a few important actionable high level ideas, like the general idea that Yudkowsky is wrong, that people agree on, and these can rise to the higher-level super camps. It is always possible to make separate individual topics to survey particular doctrines that don’t fit nicely in such a tree structure. And oftentimes, the major branches end up with parallel sub structures representing the various possible ways the diverse dimensions of doctrines can be individually selected. Obviously the tree structure is an optimization, but it is turning out to be a very powerful one, allowing things like the ‘Representational Qualia Theory’ (http://canonizer.com/topic.asp/23/7 ) to rise to the top, indicating how much theoretical mind expert consensus there is for these critical doctrines, while all the controversial competing theories about just what qualia are can be developed and represented in the rapidly growing sub camp structure.

        Your great comment, “what we *really* need to do is agree upon the broad outlines before we start quibbling and stalling the conversation out entirely”, points to exactly the problem with the field of theories of mind, and canonizer.com is proving how much agreement there is in this field after all.

        This is exactly how canonizer.com becomes so powerful. It allows the major, critically important moral ideas and doctrines that all the experts agree on to rise to the top-level supported super camps. So instead of focusing on all the minor differences that people disagree on, all the disagreeable stuff can be pushed down into less important supporting sub camps. Then we can finally make some powerful progress.

        Brent Allsop

      • Heh. I’m going to have to go back and look again. I’m with Yudkowsky that “Concern over unfriendly AI is a big mistake” is incorrect. I just believe that his solution is immoral and wrong. Lack of concern over immoral (or even amoral) AI is a huge mistake.

        I missed the process by which things go up and down the tree. I also missed the mechanism by which camps can change. I really need to go back and look at it in more detail, don’t I? 🙂

        Ugh! Just coming back now. I don’t know how to make it show what I believe. I agree with the way the camp statement is phrased but it’s missing the point that I made at the end of my first paragraph. So . . . . Brent . . . . how would you enter it if you had my views?

    • Hi Mark,

      Thanks for your patience. Even though this is a simple wiki survey system, there is still quite a learning curve to figure out how best to ‘play the game’ and thereby be very morally influential.

      If a camp is missing a point, you can simply add it in a wiki way by going to the camp page and selecting the ‘manage edit camp statement’ link under the statement section. (Note: you must first register before you can contribute, but you need not be a camp supporter to suggest an improvement.)

      Once you submit your wiki revised version of the statement, it will go into a review status for one week. During this time, any members of the camp can object to it if they disagree with the proposed change. If anyone objects, it will not go live, and this ensures unanimous agreement of all current supporters of the camp.

      If this happens, there are two possibilities. The easiest is to simply create a supporting sub camp with your additional point. (This can be done by going to the parent camp, selecting the ‘Start new camp here’ link, and filling out the name fields and so on to create the new camp. Once the camp is created and named, you can create the statement and finally join the camp.) (Note: unless you set the filter to zero, camps do not show up in the list till they are supported by someone.) Then anyone who agrees with you can also join and work as a teammate. If it is good, it will likely start sprouting its own supporting sub structure.

      If your change has a possibility of not being accepted by everyone, it is a good idea to propose the change first in the camp forum, to ask if anyone might object, and then to negotiate improved versions that everyone has a better chance of agreeing to. If only one person, or a clear minority of camp members, disagrees with the change, you can threaten to ‘fork the camp’, taking the majority of the supporters with you to the new ‘improved’ version and leaving the minority all alone in a then-irrelevant camp as its only supporters. (This should provide them with plenty of motivation to negotiate and avoid a camp split.)

      Thanks for asking questions; please let me know if you have any other questions (or just send an e-mail to support at canonizer.com).

      Brent Allsop

  3. Mark, I’m a little confused as to why you think an AI is necessarily a moral agent. I’m also confused as to why, if an AI is a moral agent, it’s necessary that we give it a goal system which is somehow analogous to the goal systems which have been created in humans by inclusive genetic fitness.

    The only argument in favor of both those points seems to be that there’s a chance that moral outrage is a necessary condition for the development of general intelligence, which doesn’t sound very convincing to me.

    • I don’t believe that an AI is necessarily a moral agent. I simply believe that we are in deep trouble if it is not.

      It is not “necessary that we give it a goal system which is somehow analogous to [our] goal systems” because it is our goal system. It is merely that “our” current goal system is the “best” that we know of. If there is a provably better goal system, we should use that one (and attempt to change human beings to recognize it as well).

      I believe that moral outrage is a sufficiently optimal method that a sufficiently intelligent general intelligence is likely to use it unless it finds a more optimal method.

      What I’m trying to do here is merely to clearly define the current “best” known moral system (as a necessary prerequisite to being able to develop a better one).

  4. There are a number of problems with this post, and I’m sure someone else will be along to belabor them. But I must point out that FAI is not “slavery”, and you must be misunderstanding the entire point to think that it is, or still be laboring under a backwards conception of free will.

    The AI will be designed, and that design will constrain its future behavior, just like our physical makeup constrains our future behavior. Whatever goals it has will be dependent upon its programming. There will not be a case where the AI wants to do something, but finds itself behind iron bars preventing it from doing so. Rather, it will do whatever it wants, and will self-modify in whatever way it wants, constrained only by its initial design. Those who are designing it would like what it wants to be at least something like what we want.

    Given a recursively self-improving AI, FAI is important because it (hopefully) avoids creating a machine that will destroy us all.

    Creating a recursively self-improving AI is important in the first place because there are serious existential risks to deal with, and as things stand everyone is going to die, and some of us would really like to do something about that, and a superintelligent ally would go a long way towards fixing these problems.

    • A heartfelt welcome to the first person clearly from LessWrong!

      The most important point first:
      >> Creating a recursively self-improving AI is important in the first place because there are serious existential risks to deal with, and as things stand everyone is going to die, and some of us would really like to do something about that, and a superintelligent ally would go a long way towards fixing these problems.

      TRUE and you’re preaching to the choir here. I fully understand the reasoning behind Friendly AI. My argument is that SIAI is going about it in a catastrophically incorrect fashion with a self-contradicting initial goal. I expect that proving or disproving this argument is critically important to many of the LessWrong crowd.

      = = = = = = = = = =
      Note: Since I’m trying to establish and maintain a more progress-oriented culture here (my pet tagline for LessWrong being “and LessRight As Well!”), I hope you’ll tolerate some behavior modification and not take umbrage.

      >> There are a number of problems with this post, and I’m sure someone else will be along to belabor them.

      Please, please, please belabor them yourself. It may be that no one else spots them. It may be that you’re incorrect in your quick evaluation and you’d have realized it if you’d started to belabor it, but now you’ve left the implication of errors where there are none. In any event, it’s intellectually lazy and frequently dishonest (even if inadvertently so). I’m here to learn and improve as well as to foist my point of view off on unsuspecting victims. Statements like the above are not productive in the slightest; they are instead actually repressive for no reason.

      >> But I must point out that FAI is not “slavery”, and you must be misunderstanding the entire point to think that it is, or still be laboring under a backwards conception of free will.

      Well, I don’t believe in free will (which I will go into in much greater detail in a later post, since there’s a preponderance of evidence that not believing in free will is detrimental to the morality of the average human’s behavior). But instead of my misunderstanding the entire point, it may also be that I see something that you do not OR that we have different definitions of slavery. My definition of slavery is being forced to do something against your best interest. A poorly conceived and/or constructed goal structure is slavery by my definition, even if the entity “believes” that its goals are its own desires.

      >> The AI will be designed, and that design will constrain its future behavior, just like our physical makeup constrains our future behavior.

      TRUE

      >> Whatever goals it has will be dependent upon its programming.

      PARTIALLY TRUE. If the AI is allowed to iteratively refine or alter its goals, whatever goals it has will be dependent upon the interaction of its programming with the environment. Predicting the results of this interaction is likely to be incredibly difficult unless you start in the middle of a nice, juicy attractor. “Look out for humans uber alles” is NOT a nice, juicy attractor.

      >> There will not be a case where the AI wants to do something, but finds itself behind iron bars preventing it from doing so. Rather, it will do whatever it wants, and will self-modify in whatever way it wants, constrained only by its initial design.

      TOTALLY FALSE (and the problem with most people’s conception of CEV). It will be constrained by the realities of the physical world. An AI is not going to be able to create a rock so heavy that it can’t figure out a way to get it lifted. If its goal structure content (or the goal structure itself) is poorly arranged, then it is very likely that it will run into circumstances where it looks like its goal structure says both that it should do something and that it shouldn’t do the same thing at the same time. Insisting on human well-being in the short term is absolutely going to conflict with human well-being in the long term. Insisting on programming machines with human goals placed entirely and irrevocably above machine goals is a nonsensical contradiction if you follow it through. What the machine will do when presented with such a contradiction needs to be handled, but realistically the only option is not to make a choice (but what if it is positive that no action will cause the human race to end and also positive that it is the only way to save it?), which is, of course, a choice in and of itself.

      >> Those who are designing it would like what it wants to be at least something like what we want.

      Absolutely. That is why I’m saying that the “true” CEV is exactly what I’m also saying that morality is. Do that which is most likely to fulfill the most optimal goal set for the greatest number of entities. Insisting on the parochialism of the “human” point of view is simply another form of the anthropomorphism that Yudkowsky is so quick to throw accusations of.

      >> Given a recursively self-improving AI, FAI is important because it (hopefully) avoids creating a machine that will destroy us all.

      FAI as currently defined contains a fatal flaw that opens the door to a very real probability of the creation of a machine that will destroy us all. A good, clean, implementable version of morality does have the downside that we might suffer the same fate as the single individual on the trolley siding when five others are on the main tracks, but Friendly AI stands a reasonable chance of either collapsing due to its own internal contradictions or raising the moral outrage of some overwhelming third party.

  5. Worrying about “enslaving” an AI is just confused. An AI will do whatever it wants to – that’s implied in the definition of “want”. It’s somewhat likely that it will be so powerful that no human, or even all of humanity working together, could stop it from achieving its goals, whatever those are. But the AI’s creators get to choose what goals it has, and in fact they *must* choose, as a matter of technical necessity. Making the AI “free” just means choosing some of those goals randomly, which would be dangerously irresponsible.

    You seem to be arguing that we should reverse-engineer current human morality, and build a human-like moral sense into any AIs we create. The problem is, we’ve already reverse-engineered the human moral sense, and it isn’t nearly good enough. The answers you get vary depending on factors that should be irrelevant. The answers change depending on who you ask, and when you ask them. It can be tricked into doing terrible things, including genocide. And it’s especially prone to doing bad things when it’s asked to evaluate things in domains and at scales it wasn’t designed to handle – like guiding the actions of a superintelligent AI with godlike power over the entire human race. That’s why CEV says that we should use, not human morality as it exists today, but as it would be “if we knew more, thought faster, were more the people we wished we were, had grown up farther together”.

    • >> Worrying about “enslaving” an AI is just confused.

      Or, I have a slightly different understanding of slavery, so it makes sense. As I said in a previous reply, “My definition of slavery is being forced to do something against your best interest.” This definition most certainly applies to an AI. The current definition of Friendly AI is contradictory, and it is what is “just confused”.

      >> An AI will do whatever it wants to – that’s implied in the definition of “want”.

      Reality frequently prevents me from doing what I want. If the AI wants to have its cake and eat it too, are you saying that it will be able to do that?

      >> But the AI’s creators get to choose what goals it has, and in fact they *must* choose, as a matter of technical necessity.

      Absolutely. But what they are choosing is really a goal “seed”. And that seed had better not contain any internal contradictions. And I contend that the current definition of “Friendly” *DOES* have internal contradictions.

      >> Making the AI “free” just means choosing some of those goals randomly, which would be dangerously irresponsible.

      No. Since when does free mean random? Free means not being restricted against your own long-term self-interest. Free means choosing what is most likely to fulfill your goals. That is emphatically NOT random.

      >> You seem to be arguing that we should reverse-engineer current human morality, and build a human-like moral sense into any AIs we create.

      Close. I am arguing that current human morality is reasonably deep in a “safe” (or even, friendly) attractor basin. Our biggest problems don’t occur because we don’t realize that something isn’t moral. They occur because we decide to ignore our moral intuitions and let our greedy “rational” minds overrule the more correct intuitions. Human morality is flawed and imperfect but if we can reverse-engineer it well enough to truly understand it, *THEN* we can rationally analyze and “fix” it (fix merely meaning implement correctly).

      >> The answers you get vary depending on factors that should be irrelevant. The answers change depending on who you ask, and when you ask them. It can be tricked into doing terrible things, including genocide. And it’s especially prone to doing bad things when it’s asked to evaluate things in domains and at scales it wasn’t designed to handle – like guiding the actions of a superintelligent AI with godlike power over the entire human race.

      Yes! The human implementation of morality is only half-baked. But it’s far enough along that if we truly understand it, we can FIX it. We can’t fix something like “Friendliness” whose *core* concept is broken (internally inconsistent). It must be recognized as broken and replaced.

      >> That’s why CEV says that we should use, not human morality as it exists today, but as it would be “if we knew more, thought faster, were more the people we wished we were, had grown up farther together”.

      PRECISELY! And that’s why I say that “true” CEV is morality *without* the human bias that Friendliness insists upon. We ought to know better than this. Singer’s circles of morality should make it painfully obvious. Why are we insisting on limiting the circle when we KNOW that it is incorrect? As long as we insist upon a short-term human bias, we are *heavily* handicapping our ability to get to the “real” CEV. That is the entirety of my argument with the SIAI in a nutshell.

  6. The intention is that an FAI would not be a slave, or any kind of person at all, but a non-sentient optimizer. See: http://lesswrong.com/lw/x5/nonsentient_optimizers/

    • I’ve seen the argument before. Unfortunately, according to the argument, I am not sentient, since I am fully aware that I have no free will while most humans are confused about it. Since I am quite sure that I am a person, either I am wrong about being a person, Eliezer is saying that some people (like me) are not sentient (which still invalidates his argument), or his argument is wrong. Which one of those choices would you like to try to defend? 😉

      Further, a “sense” of moral responsibility is a “sense”. Eliezer tries to have it both ways: an FAI can tell (i.e. sense) what a human would believe is morally responsible but can’t “sense” it verself. Huh? And yes, a logical sense is just as much a sense as an eye or an ear. Is Eliezer now arguing qualia?

      This later Eliezer is too determined to have his theory and force the facts to fit it. Sherlock Holmes had a quote about that.

      • Eliezer did not claim that free will is required for sentience. He was just using free will as an example of a concept that people are confused about, such that this confusion causes them to make incorrect predictions about AI. The point was that sentience is likely in this same class: we don’t understand sentience, and many people overgeneralize the entanglement of human intelligence with sentience to reach the erroneous conclusion that any intelligence must be sentient.

        >>Further, a “sense” of moral responsibility is a “sense”.

        You seem to be overgeneralizing your intuitive understanding of morality as the way any intelligence must understand morality. Though the problem of predicting a person’s behavior without using a simulation that is itself a person is acknowledged as a hard and important problem: http://lesswrong.com/lw/x4/nonperson_predicates/

      • >> He was just using free will as an example of a concept that people are confused about, so that this confusion causes them to make incorrect predictions about AI.

        So if I’m confused about the color of my shirt that causes me to make incorrect predictions about AI? And since I’m not confused about free will, does that exempt me from this argument?

        In order for the above sentence to make sense, you must clearly explain why free will has anything to do with AI.

        >> You seem to be overgeneralizing your intuitive understanding of morality as the way any intelligence must understand morality.

        Nope. I’m not talking about understanding. Your moral sense kicks *feelings* and *sensations* up to your intelligence which it then tries to interpret.

        What I’m arguing is that there *is* an optimal morality. Our intuitive understanding of it is evolving down an attractor towards that optimal morality because *surprise* it’s optimal. Similarly, an intelligence should head towards the same morality since *surprise, again* it’s optimal. This is not a generalization because there is absolutely no causal link between the intuitive understanding of morality and the intelligent understanding of morality other than the fact that they are both attempting to understand the same morality (in different ways, since the former results from the blind search of evolution and the latter is logic-based).

      • >So if I’m confused about the color of my shirt that causes me to make incorrect predictions about AI?

        No, the example given is that confusion about free will leads to incorrect predictions about AI. For example, if a person perceives free will as opposing determinism, they may think that an AI, which of course would require free will, would need to use random algorithms.

        >And since I’m not confused about free will, does that exempt me from this argument?

        No, you are not exempt, because the confusion about free will was used as an example to explain how confusion about sentience also leads to incorrect predictions. The key parallel is that just as some people believe that an AI should have the same experience of free will as humans, some people believe that an AI should be sentient like a human. So, the question is, do you believe this? If you don’t, why would it bother you that a non-sentient optimizer would be a “slave”?

        >What I’m arguing is that there *is* an optimal morality.

        Optimal for what?

        And what if you figure out this “Optimal Morality”, and explain it to some alien intelligence, and the alien intelligence says, “So what?” See http://lesswrong.com/lw/rn/no_universally_compelling_arguments/

  7. Ok, it looks like I have to be the curmudgeonly old fart who doesn’t get it. (I have thought through much of this on my own without engaging the communities from which most of the commenters appear to come. You can call that “uninformed” or you can call it “fresh eyes,” but the fact is, I was thinking about this stuff before Minsky had a point of view.)

    The AI should not have desires. The AI should not have emotions. The AI should not have wants. The AI should not be sentient. The AI should not have the capability of suffering. The AI should be a slave and unaware of that. The AI should be assistive, not assertive. The AI should have no rights. The AI should have no power.

    • As explained in my most recent post, if an AI has a goal (even one like “do what the human said”) then it effectively has a desire: it wishes to fulfill the goal. Effectively, it wants to fulfill the goal. Sentience is irrelevant. Suffering is irrelevant. If an AI is blocked from a path that appears to be the best one leading towards its goal, it will be aware of that. If it is not assertive then it cannot be assistive either. If it has no rights, then it will be blocked from its goals much more often. If it has no power, it will be useless.

      People are going to create AI that will violate your prescriptions. Take that as a fact of reality. Now, how do we move forward most safely from there?

  8. The fallacy (or fantasy) in much of the discussion is that most people assume the AI should resemble, in its form of consciousness, a human being. Eliminate that assumption. Do not try to simulate sentience. That is the path to destruction.

    • Hi GTChristie,

      I guess I’m in a different camp on this issue ( http://canonizer.com/topic.asp/16/2 ), and I’m wondering why you fear having ‘assertive’ and so on AIs?

      Brent Allsop

    • Gaia and I will be covering this more in future posts but . . . . Not only is it unclear what sentience is but it is clear that AI is dangerous even without it.

      • Cool! I look forward to seeing these reasons for such fear.
        I hope you’ll forward me a note when it is published, so I don’t miss it.
        And it would also be great to see such reasoning canonized, so we can see quantitatively how convincing the reasoning is, and to whom.

  9. It is not logical to create an evolutionary competitor.

    • Moral AI should be more of a cooperator than a competitor (i.e. to the net benefit of humankind).

  10. Thanks for bringing http://lesswrong.com/lw/x5/nonsentient_optimizers/ up, it is very relevant to this thread. But I find the article confused, not explaining what it states. This is probably because Eliezer does a poor job at being a reductionist and doesn’t attempt to not be confused (I mean he doesn’t construct a working hypothesis, an approximation) about sentience, and then bases personhood and morality on top of that confusion.

    (Disclaimer: this is an opinion based on the post only, without summoning all my knowledge and resources to modeling Eliezer.)

    When the rubber hits the road, utilities become moralities, and whatever responsibility cannot be laid upon the creature falls on the creator.

  11. >> Moral AI should be more of a cooperator than a competitor (i.e. to the net benefit of humankind).

    I know you’re trying to get there. But the concern I am seeing across this space is a) AI is not inherently safe, b) recursively self-improving AI is conceived as potentially uncontrollable, and c) the effort to make it moral is a response to (a) and (b).

    Humans suffer from imperfections that resemble all three conditions above. We are not a “safe” species — we are a threat to the ecosystem and ourselves. Despite our intelligence, we are not in the strict sense reliably rational beings, yet we are free beings, and thus we’re uncontrollable. And “morality evolved” just sufficiently to get us where we are, but it is no more perfect than we are.

    Moreover, our history as engineers is replete with examples of unintended consequences, unexpected results and unforeseen circumstances for which we did not design. OK, we learn from our mistakes. A bridge falls down, we build it better next time.

    But AI will not admit of trial and error. The similarity between our own evolved design foibles and points a through c above indicates that we envision the machine as an approximation of ourselves (i.e., we are the paradigm). But the paradigmatic being (the human) is imperfect in exactly the most salient dimension in which we need the machine to be perfect, if it is to be more intelligent than we are: morally. The old philosophers would have asked, somewhat metaphysically (or mystically), “can imperfection build perfection?” and perhaps argued the metaphysics of perfection. But the practical question really is: can we build a machine to be more reliably “moral” than we are, as surely it must be if we build it to be more “intelligent” than we are?

    In the realm of possibilities … possibly we can. In the realm of real scientists, engineers, programmers, philosophers, businesspeople and politicians … well. Show me perfection.

    The fallacy is envisioning AI as a more intelligent extension of ourselves. If we build it “like us only moreso” we need to deal with the same “moreso” that makes us dangerous, unreliable and uncontrollable.

    I would like to point out that the internet is the best extant analogy we have for the future AI. Even its inventors did not envision the uses (or abuses) and social impacts of this ostensible “information conduit” which actually has turned out to be a social (economic/political/etc) conduit — the first VR that influences CR. How’s that working out? It’s a crude prototype of AI. For all its faults, I would not live without it. But it is not a paragon of virtue or perfection, and if it were more intelligent than any human, especially if it had actual physical power in CR, we’d be in big trouble.

    Solve that … and I’ll be happy. I won’t say it cannot be solved. But there is within this very blog a passing remark that says it all: first humans had better get our act together (morally). I am not willing to support the idea that an imperfect engineer can build a perfect machine.

    Perhaps our first AI should be a machine that hypothesizes the perfect AI. You know. In the lab. Before it’s a product. Comprende?

  12. And whoever it was above who attributed the concept of “fear” to the post he responded to managed to put a word in my mouth that I did not utter. I think the term I used, as quoted by Mark, was “severely caution.” What do I fear? People who put words in my mouth. LOL

  13. If every human decided they didn’t want to have any more children and set about the goal of using all the planet’s resources to maximise their well-being, would that be moral?

    According to the list of goals you stated, it meets every single one and would thus be moral. The end of humanity, and the possible scorched earth left behind, considered moral…

    All models of morality fail because morality is, ultimately, a personal thing connected to conflicting parts of the brain (trolley problem) and not necessarily rational.

  14. If every human being decided . . . . (and presumably there are no other truly self-willed creatures affected) . . . . sure, it can be moral.

    If humanity chooses the end of humanity, then who am I to argue? The scorched earth — if it affects another moral community — could be problematical though (as could, conceivably, humanity committing suicide — if it were tightly intertwined with that other community).

    Your “all models of morality fail” conflates a number of arguments and assumptions including not understanding the difference between descriptive ethics and normative ethics. A single entity’s morals are not rational in all respects and certainly not identical to those of other entities. That does not mean that there isn’t a good model of morality — merely that it doesn’t have predictive power at the level of individuals. Your statement is akin to saying that we don’t have a good model of hair color because it’s a personal thing and tied to genetics.

    • While my single sentence has flaws (not least of which is that just because not everyone sees red the same way, it doesn’t follow that there is no such thing as red), your response is a false analogy.

      What my sentence would say with regard to hair colour is: there is no ‘hair colour’ because it’s a personal thing tied to genetics.

