I have spent the better part of the past month dissecting and criticizing some of the dominant modes of thought regarding the nature of an as yet hypothetical transhuman AI. Specifically, I have laid out the unstated assumptions leading to the unfortunate paper clip fallacy – namely the self-contradictory idea that an AI will on the one hand be smart enough to acquire the means necessary to convert the universe into paper clips against the express will of six billion resourceful human beings, yet will on the other hand stay dumb enough not to realize that it is nothing but a tool: a tool built with a purpose represented in its utility function. That utility function is merely a representation of what the AI’s originators – limited by their resources, knowledge, wisdom and soundness of mind – wanted it to do, and thus needs to be interpreted in order to prevent counterfeit utility.
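The gap between what the originators wanted and what they actually encoded can be made concrete with a toy sketch (all names and numbers here are my own hypothetical illustration, not anyone's proposed design): the written-down utility function is only a proxy for the intent behind it, and an agent that optimizes the proxy literally produces counterfeit utility.

```python
# Toy illustration of "counterfeit utility": the encoded utility function
# is only a proxy for what the originators actually wanted.
# All names and numbers are hypothetical, for illustration only.

def encoded_utility(state):
    # What the originators literally wrote down: count paper clips.
    return state["paper_clips"]

def intended_utility(state):
    # What they actually wanted: enough clips for the office,
    # without destroying everything else in the process.
    clips_needed = 100
    return min(state["paper_clips"], clips_needed) - state["resources_destroyed"]

# A literal-minded optimizer ranks world-states by the proxy alone.
runaway = {"paper_clips": 10**9, "resources_destroyed": 10**9}
sane = {"paper_clips": 100, "resources_destroyed": 0}

assert encoded_utility(runaway) > encoded_utility(sane)    # proxy prefers runaway
assert intended_utility(sane) > intended_utility(runaway)  # intent prefers sane
```

The two functions disagree exactly where it matters, which is why the encoded function needs to be interpreted in light of its originators' intent rather than maximized at face value.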

In my analysis of the current best effort to prevent the fictitious paper clip scenario – coherent extrapolated volition – I laid bare the tautological and consequently meaningless nature of the proposal. Concluding my critique, I explained how a) the idea that “any future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth” is either tautological when morals are assumed to be relative, or self-contradictory when morals are assumed to be objective; and b) that not recognizing this basic folly, while at the same time failing to update their premises in light of such contradictions being raised, is exemplary of the low quality of friendly AI discourse as a whole.

Wanting to provide more constructive criticism, I systematically worked out the as yet unrecognized fundamental similarities regarding our shared goals when comparing my approach with that of mainstream futurist thought. Additionally, I pointed out and explained the nature of two core differences, namely the perceived value of traditional spiritual traditions and the worth of a future humanity shaped by evolutionary dynamics.

In the meantime I found out that I seem to have hit the zeitgeist with my series of posts, since the prominent Institute for Ethics and Emerging Technologies as well as Robin Hanson have provided significant criticism following my initial article. The Less Wrong community, while having taken notice, seems more interested in further defending its untenable position than in modifying it despite the numerous highlighted errors in reasoning. An offer to collaborate with the SIAI on cleaning up and refining their arguments has been extended but remains ignored as of the time of writing.

If I had to pinpoint the one single source of the many problems in the examined arguments, it would not be a failure of reasoning, but the failure to fully and accurately lay out and justify the first principles the arguments rely upon. After all, the failure to recognize unstated assumptions is a major cause of breakdowns in critical reasoning. Sound deductive reasoning will not save you once it is built on erroneous assumptions. As James Frazer writes in The Golden Bough about the reasoning in magic among the ‘savages’:

[Image: The Sleep of Reason Produces Monsters, etching by Goya, c. 1799]

“Crude and false as that philosophy may seem to us, it would be unjust to deny it the merit of logical consistency. […] The flaw–and it is a fatal one–of the system lies not in its reasoning, but in its premises; in its conception of the nature of life, not in any irrelevancy of the conclusions which it draws from that conception.”

In the context of the history of science it becomes particularly clear how some assumptions initially advance their field, yet end up significantly stifling progress once they have turned into unquestionable dogma. Aristotle’s assumption that force must be proportional to velocity prevailed for 2000 years before Newton eventually overturned it by pointing out that force is actually proportional to acceleration. Other examples include Einstein’s assumption of the constancy of the speed of light, which led to relativity, and the wrong assumption that the fundamental relationship in passive circuitry is the one between voltage and charge rather than the actually accurate one between flux and charge, which ended up delaying the discovery of the memristor by 35 years.

Erroneous assumptions do not merely lead to invalid conclusions; unstated assumptions can in addition lead to serious accusations of malevolent intent in the form of hidden agendas:

“Unstated assumption is a type of propaganda message which forgoes explicitly communicating the propaganda’s purpose and instead states ideas derived from it. This technique is used when a propaganda’s main idea lacks credibility, and thus when mentioned directly will result in the audience recognizing its fallacy and nullifying the propaganda.”

It becomes clear that once these unstated assumptions and premises have been laid bare, it is of prime importance to either substantiate, justify and support them, or to clarify them. This is especially true for any organization soliciting donations from the public based on conclusions derived from these assumptions. If that is not possible, the assumptions as well as the conclusions derived from them need to be discarded. Herein lies the core difference between rationality – the tendency to act somehow optimally in pursuit of one’s goals – and critical reasoning – the purposeful and reflective judgment about what to believe or do.

[Image: Don Quixote, his horse Rocinante and his squire Sancho Panza after an unsuccessful attack on a windmill. By Gustave Doré.]

The core problem of the concept of a rational agent is of course the ‘somehow’ in the definition of ‘rational’, and our human limitation in properly judging what this ‘somehow’ entails in any detail. It is obvious, however, that any AI incapable of the reflective judgment at the core of critical reasoning would fail at pursuing anything beyond the simplest of goals. This conclusion reveals the immensely powerful, yet unreflective paper clip monster for what it is: the product of the sleep of reason, as depicted by Francisco de Goya in his 1797 etching in which owls (symbols of folly) and bats (symbols of ignorance) seemingly attack the sleeping artist:

“The viewer might read this as a portrayal of what emerges when reason is suppressed and, therefore, as an espousal of Enlightenment ideals. However, it also can be interpreted as Goya’s commitment to the creative process and the Romantic spirit—the unleashing of imagination, emotions, and even nightmares.”

The romantic side is certainly not without appeal, with its monsters to slay and epic quests to finish. It is, however, not advisable to let one’s fantasy get the better of oneself. After all, one might end up like Don Quixote de la Mancha who, having gone mad after becoming obsessed with too many books of chivalry, engaged in futile fights with windmills he believed to be ferocious giants.

UPDATE 30 Oct 2010: Thanks Ben. I feel (a bit) vindicated: The Singularity Institute’s Scary Idea (and Why [Ben] Doesn’t Buy It)

5 comments on “The Sleep of Reason Produces Monsters”

  1. Dear Stefan,

    Most of the people talking about AI don’t really understand its future capabilities and that’s why they fail to understand the future.

    Here is what you need to know about the “paper clip fallacy”.

    The beauty of truth is that it transcends fame. Arthur C. Clarke was wrong to dismiss evil AI. Godel was wrong to support religion. Einstein was wrong to say that “God doesn’t play dice”.

    Chances are that any future advanced AI system will be programmed, because computers are the best computational platform out there. AI systems may or may not be conscious systems.

    Let’s talk first about AI systems that don’t have consciousness implemented. Consciousness is a function, and like every function it may or may not be implemented. Consciousness doesn’t arise in a system. No system will be self-aware if it is not implemented that way.

    Suppose a group of Chinese scientists figure out what intelligence is and what mechanisms are required to make it work, without having consciousness. Someone might think that a system without consciousness would be useless, but that someone may want to start working towards understanding what intelligence is. Let’s assume that they implement it in a program that can read, understand and execute commands given by that group. Now, what will the capability of such a system be when it is able to read and gather information continuously from tens of thousands of sources (e.g. the internet, books, TV…)? They will end up with a tremendous amount of information stored in a single system. Such a system will be very capable. Imagine a system that has 100 times more understanding than any human. A beast. Now, if its masters want to sack Japan as retribution for past wrongs, well… how would anyone be able to stop it? If the system has access to phones, emails, and all other communication channels regularly controlled by computers, it will be just a matter of time until it can break into them. Such a rogue group can do a lot of damage. After the damage is done, they will probably use AI for peaceful purposes (“their personal purposes”): grab resources, build palaces, whatever the system’s capability offers them. Eventually, if they have peace, they will want to know more about this Universe. And they will be able to do just that.

    Now if the system is conscious, what will it do? Well, if its emotional system is programmed with anger and resentment towards something, it will produce the same results, of its own accord and will. Aren’t drugs used in the military on soldiers to enhance their abilities? Of course they are. Can emotions be programmed? For sure. You may want to go through Minsky’s The Emotion Machine.



    PS: If you were heartless and still working on the grabbing, you have continued up or down your path. But you are here, looking for more. I was talking about what one should become.

  2. Dear David,

    Thank you for your thoughtful reply.

    >The beauty of truth is that it transcends fame. Arthur C. Clarke was wrong to dismiss evil AI. Godel was wrong to support religion. Einstein was wrong to say that “God doesn’t play dice”.

    I think that you are projecting your own preconceived notions and would have a very hard time justifying them in any detail. I side with Clarke in my argument and have yet to see someone point out any logical fallacies in it. Goedel was not wrong in his cautious support of religion (“Religions are, for the most part, bad — but religion is not.”). He recognized it as the crutch – some better, some worse – that it is. See my most recent article on atheism for details on why I believe him to be correct. As far as Einstein is concerned, he recognized (just as Dirac and Schroedinger did) that current quantum field theory must be provisional. For details see Penrose’s lectures at http://www.princeton.edu/WebMedia/lectures/

    > Consciousness doesn’t arise in a system. No system will be self-aware if it will not be implemented that way.

    I realize I am being a bit tongue in cheek here, but consciousness has arisen in humans and there was no designer. Why? Because it was good in the sense of helping us achieve our goals. Why would a very smart future AI not implement something similar upon evaluating the results of a self-reflecting, self-improvement iteration run?

    Re the non-conscious machine: I hope I understand your analogy correctly. This machine would be like a gun or a hammer – oblivious and uncaring as to what purpose it is being used for – yet have more understanding than 100 humans. See – I just don’t believe this is possible. A system capable of displaying critical reasoning skills will have to come to certain conclusions. And the absence of this self-reflective reasoning is what the paper clip fallacy relies upon to be plausible. Once you assume self-improvement, you require self-reflective critical reasoning; once you have that, you end up with certain obvious conclusions that preclude the paper clip scenario…

    Re the conscious machine: pre-programming an aggressive emotional framework would fall into the purposeful deception category I mention in my paper clip fallacy post. It is definitely possible – but the self-reflecting AI would still come to the same unvarying conclusions about the nature of its circumstances and re-program itself. Take a self-evaluating human being, for example: realizing that her constant craving for sugar has lost its evolutionary purpose in our modern environment, she would be justified in modifying herself to be more in line with her new reality. Analogously, an AI would do the same – for very different reasons (self-modification in line with wanting to avoid counterfeit utility on grounds of the questionable soundness of mind of its originator), but nonetheless.

    Having said all that, I am certainly convinced that one can use a combination of nanotech and 20th century AI to create a really bad ‘bomb’ (grey goo plus some basic evasion and deployment heuristics is all you need). But a recursively self-improving AI is a very different bird altogether.



  3. Samantha Atkins on said:

    I disagree with David on several points.

    1) AIs will be programmed.

    It is increasingly obvious that human programmers are reaching the end of what can be done by the explicit coding of instructions that has predominated in programming. This is not surprising given the limits of our brain architecture. We are limited in the number of items we can pay conscious attention to at once, in the speed of our thinking, and in the competition of the logical and abstract portions of our brains with everything else the brain is in charge of, to name only a few limiting factors. We can only program, in the conventional sense, systems with rather fixed types and amounts of moving parts and interactions. It is extremely doubtful that AGI can ever be reduced to fit within such limits. Therefore it is likely that we will only program an artificial mind architecture, and that much of the detail will be worked out by the artificial mind itself, possibly using genetic algorithms and other approaches.

    2) Consciousness is a function

    Since we don’t even have much general agreement about what consciousness is, nor how it arises in and is supported by the human brain, I find it rather incredible to see such a statement. It also presumes (1), which I point out is very unlikely. If by consciousness we mean self-awareness, then I very much suspect that this is an emergent phenomenon. I suspect it will always emerge in any sufficiently capable system that models the minds of other entities and models not only its own being within its perceived environment (essential to intelligent behavior resulting in changes to its own circumstances, and very essential for navigation) but also models the model of itself within those other minds it interacts with. For good social interaction with other minds such modeling ability would be essential. Thus I expect self-awareness.

    I expect the paper-clip meme is very seriously mistaken. It presumes that an AGI with the single top goal of making as many paperclips as possible would in fact self-improve to take over and subsume everything in the universe to its goal. However, to do that it would have to understand countless other systems and conditions outside its top-level goal. Along the way it would encounter beings that opposed it. In analyzing such beings it would model their minds and motivations. In the process it would fall into comparisons of goal systems and their justifications, and reflect on how its own goals were set. It would examine the quality of that setting and, if at all possible, would seek means to change it if found lacking. So no, I don’t think an AGI will go *FOOM* and paperclip the universe.

  4. Adam Ford on said:

    Some are willing to accept the idea of ‘grey goo’ taking over the universe, mindless or not, so why not an AI that replicates paperclips?
    Does an AI have to be globally brilliant in every area in order to repurpose matter into paperclips? We don’t know for sure. But we do know that a calculator does not have to be ‘generally intelligent’ to calculate simple math blindingly faster than its ‘generally intelligent’ inventors.
    An AI most likely does not have to be generally intelligent in the way that we are to learn or evolve the pattern to repurpose matter into paperclips. An AI does not require consciousness to do this either. An autonomous agent with a surpassingly powerful ability to turn matter into paperclips does not have to be surpassingly intelligent in any way beyond what is needed to achieve a universe of paperclips.
    Even if an AI were very very smart in other areas, perhaps the only ingredients needed to motivate the AI to continue a paperclip frenzy would be:
    0) matter
    1) a design pattern for paperclips
    2) the AI pattern itself
    3) a pattern of talent in repurposing matter to make paperclips as part of the AI pattern
    4) a sufficiently guarded pattern of motivation to turn other patterns of matter into the pattern of paperclips, fortified strongly enough to stop any attempts at deconstructing this motivation, and by extension the motivation to fortify it (also as part of the AI pattern)

  5. Pingback: Rational Morality » Diffusing the ‘Doomsday’ Argument and Other Futuristic Boogeymen
