I am watching Jamais’ full talk at the New York Future Salon, and I have a comment about the assertion Jamais makes around 34:30. He says that many of the people working on AGI at the Singularity Institute are biased and not appropriately acknowledging the way that their cultural and political biases may work themselves into an AI’s programming — that they view themselves as objective scientific types free from bias.

I find this surprising. Looking back at the intellectual history of the Friendly AI research program, the entire book-length treatment of the topic published in 2001 addresses dozens of questions of the form, “What if the programmers get X wrong?”, or, “What if the programmers are biased about X?” Essentially, Eliezer approaches the problem assuming that the programmers will mess up practically everything, and asks, “How can we still get a friendly outcome?” For instance, there is a section on layered mistake detection. The work goes on for hundreds of pages on this very theme — in fact, it is the unifying theme of the document. Shorter summaries of the ideas can be found at “SIAI Guidelines on Friendly AI” and “Features of Friendly AI”.

To my knowledge, Singularity Institute researchers were the first to argue that hand-coding motivational rules for AI is pointless and counterproductive, because rules are subject to reinterpretation, and the programmer cannot foresee every possible scenario in which the rules may not hold. If the Singularity Institute hadn’t figured that one out, how long would it have taken another group to do so? It’s hard to say, but as far as I know, no roboethicist outside of the SIAI community even bothers to talk about it.

It was a Singularity Institute researcher that came up with the idea of “wisdom tournaments” — test beds where arbitrary pieces of an AI’s programming are manipulated, and the AI discovers new ways of preserving its human-friendliness even under the worst foreseeable circumstances, including random errors caused by cosmic rays hitting the computer the AI is running on. It was an SIAI researcher that first came up with the idea of building goal systems completely without observer-centered moralities, and explained why the idea of an observer-centered morality is fundamentally anthropomorphic. Even today, the vast majority of philosophers and AI researchers do not understand why observer-biased beliefs exist in imperfectly deceptive social organisms, and why these mental features do not generalize to all possible minds. It was Singularity Institute researchers that came up with the idea of probabilistic supergoals. I could go on and on.

Singularity Institute Friendly AI researchers are characterized by meticulous self-doubt and intellectual conservatism. See, for instance, “Assumptions conservative for Friendly AI”. Conservatism in Friendly AI and conservatism in futurism are opposite things, and the difference has led some to believe that SIAI is committed to radical futurism. In fact, we are simply trying to be conservative: even a 1/100 chance of human extinction via poorly programmed self-improving AI is a big deal.

It is this conservatism and pervasive self-doubt that originally attracted me to SIAI as an organization. Unfortunately, it seems that very few of our critics know where to look to find all this self-doubt. Even though large parts of Creating Friendly AI are out of date, the document is an excellent example of our attitudes towards Friendly AI design. This tradition of self-questioning and constant self-scanning for biases can also be seen in conversations with Anna Salamon and Steve Rayhawk, our two most recently hired researchers.

For Pete’s sake, our organization has created an entire website and community, Less Wrong, devoted to discovering and attempting to ameliorate cognitive biases. If Creating Friendly AI and Less Wrong cannot persuade Jamais that we are extremely self-reflective and self-doubting, then I’m not sure what can. Perhaps all the rhetoric about bias leads Jamais to think that some of us believe we ourselves are free from bias, because we are better than your average intellectuals at identifying and addressing biases, but nothing could be further from the truth. We are acutely aware that the biases we are examining are just the tip of the iceberg, and the additional biases that influence our thinking every day are colossal. Part of our strategy for minimizing the impact of our biases on AI design is to leverage an infant AI’s intelligence to reflect on our design choices and modify them accordingly. This is not a magic solution by any means, but few other AI researchers even stop to consider such strategies. Because an AI could be designed from scratch to reason in accordance with the laws of probability theory, it would by design be immune to many of the biases that plague humans, though it certainly wouldn’t be perfect.