More Subtle Objections to Jamais Cascio’s Views Monday, Nov 16 2009
friendly ai 12:59 pm
Brian Wang of Next Big Future chimes in on Jamais’ NYC Future Salon talk:
I disagree that there has been or will be much effective overall control of the result of technological development by governments and people. Also, most of the efforts tend to be reactive and take a long time to build a coalition to effect policy. Even when there are regulations and laws, people still break laws, and regulations often do not control or prevent what they were supposedly intended to. An example was the introduction of the Sarbanes-Oxley rules in response to the Enron scandal. Yet those rules did nothing to mitigate or prevent the banking and credit problems.
There are people who criticize the Singularity Institute for working on the technical problem of developing Friendly AI instead of, say, thinking about political and social responses. This makes no sense to me. Would these same people have said that the Y2K computer problem people should focus on the political and social responses instead of organizing IT departments and programmers to fix the computer programs and to develop tools to make that process easier and more reliable? The Singularity Institute is concerned that computer programs and computer systems made for AI will end up having the technical problem of not doing what we want in general and being dangerous. It is perfectly valid to work on a set of attempted technical, algorithmic, and mathematical solutions.
Just as with the Y2K bug, maybe the problems will not be as bad as feared. The efforts to resolve and mitigate the Y2K bug probably did help. Friendly AI is a far tougher software design problem than fixing Y2K bugs in IT systems.
It is also clearly tougher to convince people that advanced AI is a potentially serious issue. Y2K was a very simple thing to explain – we have built most of the dates with only two digits for the year and the programs will error out or misbehave if we do not fix it. There are critical systems like power plant, military and medical software and emergency services software which you do not want to have down or operating improperly.
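To make the two-digit-year problem in the quote above concrete, here is a minimal Python sketch. The function names and the pivot value of 50 are illustrative assumptions, not taken from any real Y2K remediation tool; "windowing" around a pivot year was one common style of fix, shown here only as an example.

```python
def years_elapsed_two_digit(start_yy: int, end_yy: int) -> int:
    """Naive elapsed-years calculation on two-digit years (the Y2K flaw)."""
    return end_yy - start_yy


def expand_year(yy: int, pivot: int = 50) -> int:
    """Windowing fix: map a two-digit year onto a four-digit year.

    Years at or above the (assumed) pivot are read as 19xx,
    years below it as 20xx.
    """
    return 1900 + yy if yy >= pivot else 2000 + yy


def years_elapsed_fixed(start_yy: int, end_yy: int, pivot: int = 50) -> int:
    """Elapsed years after expanding both years to four digits."""
    return expand_year(end_yy, pivot) - expand_year(start_yy, pivot)


# Someone born in '65, with the current year stored as '00':
print(years_elapsed_two_digit(65, 0))  # -65: the program "misbehaves"
print(years_elapsed_fixed(65, 0))      # 35
```

The bug is trivial to state and to demonstrate, which is part of why Y2K was comparatively easy to explain to non-specialists.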
I still don’t really understand how Jamais views the Singularity, and I’ve read all his articles and watched his hour-long talk. It seems that he is uncomfortable with the idea of humans being surpassed by any entity whatsoever. In his talk, he essentially says that he would consider it a dystopia if we got advice from superintelligent machines — he thinks it would be sacrificing our autonomy. But, if good advice is being offered, does it really matter if it is coming from a human, an AI, or some graffiti on a bathroom wall? Good advice is good advice. Receiving advice and help from superintelligence (which presumably would share at least some of our values, and probably be more humanlike in that respect than the cold stereotypes in science fiction) would increase our freedom, not decrease it.
Humans will not be the #1 entities on the planet forever. Big deal. Jamais complains about the idea of an AI singleton, but humans will simply be too slow to respond to all the possible planetary threats in a world of advanced AI and MNT, so some superintelligent coalition will have to deal with it, and they will make the decision, not us. If we assume that qualitatively higher intelligence is possible, as I think there are strong scientific reasons for believing, then the comparative value of unenhanced human decision-making will fall. So what? The alternative is staying with a human-only society forever. Is that what Jamais wants?
When it comes to issues that really matter, humans will eventually be viewed as dumb by superintelligences. Keep in mind that superintelligences might derive from humans rather than AIs, but even a superintelligence that is only as much smarter than us as we are smarter than Homo habilis would still represent a massive difference. I can imagine Jamais busting into a conference room of superintelligences communicating gigabytes per second of information to each other and manipulating concepts more complex than the human mind could ever hope to handle, and shouting, “wait, listen to my input!” Well, sure Jamais, you can give advice to superintelligences, just like a kindergartner can “give advice” to President Obama, but who cares?
An important distinction here concerns the difference between problem-solving and values. If superintelligence has values aligned with ours, then we won’t need to constantly fight to see that our values are respected, because we know they will be. At that point, it just becomes a matter of problem-solving ability, and the “freaks” that enhance rapidly and in a wholesale fashion will simply be better at problem-solving than the unenhanced or slow-enhancers. Personally, I’d actually like to be a slow-enhancer, despite assumptions to the contrary. Criticizing SIAI and Singularitarians sounds better when you frame the narrative as us wanting to be some dominant priestly class in a post-Singularity world, but most of us just want to survive.
Jamais is uncomfortable with the idea of humans not having a central role in the world anymore, but such a thing is guaranteed to happen eventually. We could still have a role in the world much more interesting and fulfilling than our current role. We forget that our “central” role would actually look pretty pathetic to certain hypothetical alien civilizations. We can’t even control our own weather. Children in developing countries are being infected by parasitic worms left and right. There’s no shame in creating smarter-than-human intelligence and utilizing its assistance to help us solve problems. It’s not an “us vs. them” situation, if they are on our side.
In his talk, Jamais points out that he doesn’t like the idea of a benevolent singleton because it’s not open source enough. But our current tentative design for a singleton would require querying the preferences of everyone on Earth, and would do so continuously, because one of our preferences is for our preferences to continuously be heard and honored. So, that’s your open source right there. The preference link would be even deeper and more meaningful than the preference link between human beings expressing their preferences to one another. Also, note that the system we propose to create would not even have a personal identity in the way that humans do. Selfishness is an evolved trait:
By “selfishness”, I do not just mean the sordid selfishness of a human sacrificing the lives of twenty strangers to save his own skin, or something equally socially unacceptable. The entire concept of a goal system that centers around the observer is fundamentally anthropomorphic.
There is no reason why an evolved goal system would be anything but observer-focused. Since the days when we were competing chemical blobs, the primary focus of selection has been the individual (10). Even in cases where fitness or inclusive fitness is augmented by behaving nicely towards your children, your close relatives, or your reciprocal-altruism trade partners, the selection pressures are still spilling over onto your kin, your children, your partners. We started out as competing blobs in a sea, each blob with its own measure of fitness. We grew into competing players in a social network, each player with a different set of goals and subgoals, sometimes overlapping, sometimes not.
Though the goals share the same structure from human to human, they are written using the variable “I” that differs from human to human, and each individual substitutes in their own name. Every built-in instinct and emotion evolved around the fixed point at the center.
While discussing retaliation, I offered a scenario of a young AI being punched in the nose, and noted the additional mental effort it would take for the AI to realize that ve, “personally”, was being targeted. The AI would have to imagine a completely different cognitive architecture before ve could comprehend what a human is thinking when he or she “personally targets” someone, and even so the AI verself will never feel “personally targeted”. You can imagine yourself pointing a finger directly at some young AI and saying, “Look at that!” And the AI spins around to look behind verself and says “Where?”
This metaphor – a being with a visuospatial model of the physical world that doesn’t include vis own body, or at least, doesn’t include vis own body as “anything worth noticing” – is analogous, not to the AI’s physical model of the world, but to the AI’s moral model of the world. A Friendly AI may be greatly concerned with the welfare of the surrounding humans, but if you ask ver “What about your own welfare?”, ve’ll say “The welfare of what?” A young AI would, at any rate; an older AI would understand exactly what you meant, but wouldn’t see the argument as any more intuitive or persuasive. A Friendly AI sees the nearby humans as moral nodes, but there’s no node at the center – no node-that-is-this-node – and possibly even no center. If you, metaphorically, say “Look at that!”, a young AI will say “Look at what?”. An older AI will understand that you see a node, but that doesn’t mean the AI will see a node.
Why even call it an AI? It’s a Really Powerful Optimization Process (RPOP). It is a “very powerful device that can drastically affect the future in precise ways”. When you’re building it, you have to assume that it will be around for a long time, possibly forever. You don’t build it to ensure that it stays around forever, but you assume it will, because assuming anything else would be irresponsible. You want to create an RPOP that is flexible enough that it can become anything we want. The idea is to create a precursor RPOP, the CEV, that then produces a physical output. That physical output is whatever satisfices our preferences. Maybe it is a fluffy stuffed bunny, if that’s what mankind’s extrapolated volition desires. Maybe it is a Sysop. Maybe it is something we can’t even imagine. Either way, the approach of using a CEV as a precursor RPOP is part of an elaborate strategy to minimize our expected future regret and maximize our expected future satisfaction, keeping in mind that fictional utopias fail because they are too simplistic.
Since human self-determination is one of our core values, we can expect a CEV output to satisfy that preference. That’s the whole point of building something that determines our preferences first, then takes action, rather than something that just starts taking profound actions and mixes them in with preference-evaluation haphazardly, which is probably what DARPA or the Wall Street quants would do if they got to AGI first. They will be lazy because they won’t be making the conservative assumption that their AGI could become powerful enough to affect the fate of the human species. If they succeed, we will suffer for it, and it will be difficult to fix their mistake because their creation will be inherently self-preserving, not because of anthropomorphic egoism, but because self-preservation will simply be its most convenient way to fulfill its goals. (It can’t fulfill its goals if it’s dead.)

I approach this in an evolutionary manner, where just-good-enough AI appears first and begins to displace human labor. If the displacement is wide and deep, then consumption will fall with rising unemployment. Governments will step in to preserve human employment, the tech will get caught up in a regulatory framework, and development will be slowed. But will this happen at all, or will just-good-enough AI simply be incompatible with a market-based economy where the incentives to invest are constrained by the decline in consumption (the stillborn scenario)? Or will each of us own an “avatar” to earn our living? Will the evolution be organic, where we begin to own our machine selves in fits and starts, transferring our skills piecemeal to our machines as they grow in sophistication? How many copies of our skills can we make, and will each of us have exclusive rights to these “selves”? Will our machines be hacked and commit crimes, and will we be responsible? Or will our skills be stolen and replicated on another machine? I think private ownership of just-good-enough AI embodied in a variety of hardware is a likely outcome. The problem might come when strong AI emerges and our avatar selves realize that we can’t own them.
Preferences change over time and could be globally nonsensical. For example, I know of someone who prefers freedom to prison but keeps acting in a criminal manner. What do you do about immediate preferences that are suboptimal in the long term?
You’re absolutely right, Michelle. Read Coherent Extrapolated Volition for some proposed mechanisms to minimize incoherence and increase consistency under reflection.
What if, through Bayesian analysis and bounded rationality, the AI could follow complex rules to achieve the appearance of acting in line with your volition? It doesn’t have to “know” or understand, just appear to be acting in line with your volition. If the goal is to do things consistent with what you want and to detect deviations from some abstract moral code, won’t the AI’s simulated subjective probabilities of acting correctly be more accurate, since its rationality would be less bounded than a human’s? Aren’t heuristics the first step toward an AI? Also, won’t teaching the AI about non-zero-sum games and how to play them give it the tools to craft, say, trade treaties and reciprocal agreements that maintain a positive non-zero-sum game? Again, this would rest on Bayesian subjective probabilities of success, or probabilities that one or the other party won’t cheat or otherwise violate the agreement.
Steven, I anticipate a steeper transition curve, but if a slow takeoff occurs, it would make sense to implement what Jamais Cascio calls “just-in-time socialism” and basically start giving away free money to humans by “taxing” the machines. Even a minuscule tax could probably fulfill most human needs and desires.
The first steps towards an AI are things like decision theory and algorithms that model complex causal networks. Human-friendly morality may actually represent 10% or less of the effort needed to get to Friendly AGI, but it is quite essential. You’re right that building an AI that doesn’t always defect in Prisoner’s Dilemma-type situations is necessary. Decision theory formulations given by Eliezer Yudkowsky and Gary Drescher represent progress towards desirable decision theories, according to my current view.
The idea of building an AI that begins by abstracting regularities in human behavior and using them as evidence for potentially useful strategies seems helpful, but it has yet to really be implemented in any substantive way, to my knowledge. In Creating Friendly AI, a similar notion is called “unity of will”. Unity of will has its limitations because human motivations are inconsistent under reflection (we aren’t always the people we wished we were, as Michelle points out), while we’d want a seed AI’s motivations to be consistent under reflection. The challenge is immensely hard, but comments like yours and Michelle’s are exactly the type of intelligent steps we need to make progress on it.
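As a rough illustration of the remark above about not always defecting in Prisoner’s Dilemma-type situations, here is a minimal Python sketch of the iterated game. The payoff values (5, 3, 1, 0) are the conventional textbook numbers and are an assumption here, as are the function names. Tit-for-tat is used only as a familiar stand-in for a strategy that does not always defect; it is not the decision theory of Yudkowsky or Drescher.

```python
# Assumed textbook payoffs: (my move, their move) -> (my score, their score)
PAYOFF = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # I cooperate, they defect
    ("D", "C"): (5, 0),  # I defect, they cooperate
    ("D", "D"): (1, 1),  # mutual defection
}


def always_defect(opponent_history):
    """Defect every round, regardless of what the opponent did."""
    return "D"


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def play(strategy_a, strategy_b, rounds=10):
    """Run an iterated Prisoner's Dilemma and return both total scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_b)  # each side sees the other's past moves
        move_b = strategy_b(history_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(always_defect, always_defect))  # (10, 10)
print(play(tit_for_tat, tit_for_tat))      # (30, 30)
print(play(tit_for_tat, always_defect))    # (9, 14)
```

Under these assumed payoffs, two agents that always defect end up far worse off than two conditional cooperators, which is the intuition behind wanting an AI that can sustain cooperation.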
Thanks, Michael. I appreciate the explanation; it is a bit clearer now, I think. In his book, The Diffusion of Innovations, Everett Rogers details some examples of when diffusion fails. One story is about the adoption of boiling water to prevent waterborne illness. In a remote Peruvian village, ingrained belief systems about when it was appropriate to consume hot or cold foods (not necessarily actually hot or cold in temperature) got in the way of adoption. The adoption rate was only 5%, and this was only in the families who were “outsiders” in some sense. But even their adoption was based on their belief systems about lowland illness and how to fight this unfamiliar thing, and they were happy that the health worker had guided them to a means to fight it. Adoption for them did not affect their social status or violate any internal rules of behavior. But the inconsistencies or misalignments between goals and behavior among non-adopters are only apparent to an objective observer. An AI must be able to handle this kind of situation and understand the dynamics of belief systems by being that objective observer. This is a very tough problem, but one an AI must be able to solve in some future scenario to save lives when belief systems cause misalignment between goals and behavior.