A Game-Theoretically Optimal Basis for Safe and Ethical Intelligence (BICA ’10)

Reverse-engineering human ethics so that they can be reconstructed from first principles reveals not only that evolution has, as would be expected, located a locally optimal solution, but also that there exists a clear path to a better solution for all forms of intelligence.

1. First Principles
Defining intelligence as the ability to fulfill complex goals in complex environments leads to defining all intelligent entities as goal-driven entities. Actions that those entities take should be judged by their effect upon the probability of the entity achieving its goals. Actions that optimally increase the probability of goal fulfillment are “right” and actions that decrease the probability of goal fulfillment are “wrong”.

Assuming that humans are reasonably correct most of the time, the fact that there exists a common consensus on the “rightness” or “wrongness” (“ethics”) of the vast majority of actions, despite the vast disparity in human goals, indicates that there are certain actions that increase the probability of fulfilling *any* goal. Indeed, Steve Omohundro used microeconomic theory and logic to identify six sub-goals that increase the probability of success for any goal under any non-terminal circumstances: self-improvement, rationality, goal preservation, evaluation protection, self-protection, and the efficient acquisition and use of resources. Unfortunately, he concluded, contrary to the evidence provided by evolution, that these sub-goals would lead in a direction contrary to human ethics.

What Omohundro missed was that there are four more sub-goals: cooperation, fairness, community-building, and increasing freedom. Non-exploitable cooperation and community-building via the game-theoretically optimal “optimistic tit-for-tat” dramatically increase the probability of receiving help and achieving economies of scale while reducing the likelihood of interference. Fairness, together with altruistic punishment, is necessary to prevent the defection of others. And reducing unnecessary restrictions (i.e., increasing freedom) always leads to increased probabilities of success.
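To make the strategy concrete, here is a minimal sketch of one way “optimistic tit-for-tat” could look in an iterated prisoner's dilemma. The text does not define the term, so this assumes it means cooperating on the first move, mirroring the other player's last move afterwards, and occasionally forgiving a defection. The payoff values, the `forgiveness` parameter, and all function names are illustrative assumptions rather than anything specified above.

```python
import random

# Conventional prisoner's dilemma payoffs as (player A, player B).
# These numbers are an illustrative assumption, not taken from the text.
PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # A is exploited
    ("D", "C"): (5, 0),  # A exploits B
    ("D", "D"): (1, 1),  # mutual defection
}


def optimistic_tit_for_tat(opponent_history, forgiveness=0.1, rng=random):
    """Cooperate first, then mirror the opponent's last move,
    forgiving a defection with probability `forgiveness` (assumed reading)."""
    if not opponent_history:
        return "C"  # optimistic opening move
    if opponent_history[-1] == "D" and rng.random() >= forgiveness:
        return "D"  # retaliate, so the strategy cannot be exploited for long
    return "C"  # reciprocate cooperation, or forgive


def always_defect(opponent_history):
    """A purely exploitative opponent, used for comparison."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total scores (A, B)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy only sees the other player's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b
```

Under these assumed payoffs and with `forgiveness=0.0`, two such players cooperate in every round, while against `always_defect` the strategy loses only the first round before retaliating. That is the sense in which cooperation here is “non-exploitable”.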

2. The Path to a Better Solution
The core of human ethics revolves around the first principle of not violating these ten sub-goals for people whom you value. The biggest shortcoming of human ethics is the short-sighted belief that violating them is ever in your best interest unless forced otherwise by the dictates of society (altruistic punishment). What complicates human ethics is that they are implemented as rules of thumb which may not be optimal in a given situation, yet they must be adhered to in order to avoid being labeled and punished as a defector. It is the attempt to analyze obsolete rules of thumb that has stymied previous philosophers.
