Before AI was hot, Henry Lieberman, a computer scientist, invited me to see his group’s work at MIT. Henry was obsessed with the idea that AI lacked common sense. So, together with his colleagues Catherine Havasi and Robyn Speer, he had been collecting commonsense statements on a website.
Commonsense statements are facts that are obvious to humans but are hard to grasp for machines. They are things such as “water is wet” or “love is a feeling.” They are also a sore spot for AI, since scholars are still working to understand why machines struggle with commonsense reasoning. On that day, Henry was eager to show me a chart in which words, such as love, water or feeling, were organized based on the data from their commonsense corpus. He showed me a plot using a technique called principal component analysis, a method to determine the axes that best explain variation in any type of numeric data.
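For readers who want to see what principal component analysis does, here is a minimal sketch. It does not use Henry's commonsense corpus; the points are made-up two-dimensional data stretched along a known diagonal, and the function name `principal_axes` is a hypothetical helper, not part of any library. The sketch only shows that the method recovers the direction along which the data vary the most.

```python
import numpy as np

def principal_axes(data):
    """Return the principal axes (as rows) and the share of variance each explains."""
    centered = data - data.mean(axis=0)  # PCA works on mean-centered data
    # The rows of `axes` are unit vectors sorted from most to least variance.
    _, singular_values, axes = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return axes, explained

# Made-up data (an assumption for illustration): wide spread along the
# diagonal, narrow spread across it.
rng = np.random.default_rng(0)
direction = np.array([1.0, 1.0]) / np.sqrt(2.0)
orthogonal = np.array([1.0, -1.0]) / np.sqrt(2.0)
points = (rng.normal(0, 3.0, (500, 1)) * direction
          + rng.normal(0, 0.3, (500, 1)) * orthogonal)

axes, explained = principal_axes(points)
print("first axis:", axes[0])
print("share of variance explained:", explained)
```

The first axis lines up with the diagonal (up to sign) and accounts for nearly all of the variation. In Henry's case the rows would be words and the columns would come from the commonsense statements, and the interesting part is interpreting what the leading axes mean.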
“Applied to commonsense knowledge,” he said, “that’s like trying to find a mathematical answer to the ancient philosophical question: What is human knowledge all about?”
So, I asked him what the axes were, and he invited me to guess.
“I don’t know,” I said. “Big or small? Live or dead?”
“No,” he replied. “Good or bad.”
In hindsight, this fact seems obvious. Every day we use moral judgments as shortcuts for both cognition and communication. We talk about “good” and “bad” weather. We look for a “better” job or computer. We enjoy “good” music and avoid “bad” wine. But while we rationally understand that a hole in a sock is not unethical, we often cannot help abusing the sweet shortcut of moralized reasoning. Henry’s chart was evidence that moralizing pervades our language and is implicit in commonsense reasoning.
Today, much of that moralizing is not aimed at the wrong pair of socks but at AI and at those who create it. Often the outrage is justified. AI has been involved in wrongful arrests, biased recidivism scores, and multiple scandals involving misclassified photos or gender-stereotypical translations. And for the most part, the AI community has listened. Today, AI researchers are well aware of these problems and are actively working to fix them.
But as the dust settles, it is worth asking not only whether AI is “good” or “bad,” but what these episodes of judgment teach us about our moral intuitions. After all, AI ethics is about us, humans, since we are the ones doing the judging.
Over the last few years, together with my team, I ran dozens of experiments where thousands of Americans reacted to actions performed by humans and machines. The experiments consisted of scenarios that could be presented as a human or a machine action, like an excavator accidentally digging up a grave or a tsunami alert system failing to alert a coastal town. These comparisons allowed us to go beyond the way humans judge AI, and focus instead on how our judgment of machines compares to our judgment of humans.
This difference may seem subtle, but it forces us to judge machines in a more realistic frame of reference. We tend to compare AI with perfection instead of comparing it with how we react to a human carrying out the same action with the same result.
So, what did the experiments teach us?
Even the first data points showed that people did not react to humans and machines equally. For instance, people were less forgiving of machines than of humans in accidental scenarios, especially when they resulted in physical harm.
But since we had thousands of data points, we could go beyond anecdotal observations. So, we decided to build a statistical model explaining how people judged humans and machines. The model predicted how people scored a scenario, in terms of moral wrongness, as a function of how they scored it based on harm and on perceived intention.
To our surprise, the model showed that people did not just judge humans less severely than machines, but that we used a different moral philosophy to judge them. The figure nearby summarizes this finding. The blue plane shows how people judge—on average—other people. The red plane shows how humans judge—on average—machines.
You can clearly see that the planes are not parallel, and that there is a rotation between the red and blue planes. This is because, when judging machines, people seem to care mostly about the outcome of a scenario: in this case, the perceived level of harm. That is why the red plane grows almost exclusively along the harm dimension. But when people judge other people (the blue plane), we find a curvature. This time, the growth is along the diagonal, representing the interaction between harm and perceived intention (technically, the multiplication of the two). This explains why machines are judged more harshly in accidental scenarios: people take a consequentialist approach to judging machines, in which intent is irrelevant, but they do not judge humans that way.
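To make the two planes concrete, here is a minimal sketch of a model of this kind: moral wrongness as a function of harm, perceived intention and their product. It is not our actual analysis or data. The scores are simulated under the assumption that machine judgments track harm alone and human judgments track harm times intention, and the names `fit_wrongness_model`, `machine_scores` and `human_scores` are hypothetical.

```python
import numpy as np

def fit_wrongness_model(harm, intent, wrongness):
    """Least-squares fit of wrongness ~ 1 + harm + intent + harm * intent."""
    design = np.column_stack([np.ones_like(harm), harm, intent, harm * intent])
    coef, *_ = np.linalg.lstsq(design, wrongness, rcond=None)
    names = ["intercept", "harm", "intent", "harm_x_intent"]
    return dict(zip(names, coef))

# Simulated ratings on a 0-to-1 scale (an assumption, not survey data).
rng = np.random.default_rng(0)
n = 2000
harm = rng.uniform(0, 1, n)
intent = rng.uniform(0, 1, n)

# Assumed pattern for machines: wrongness follows harm, whatever the intent.
machine_scores = 0.9 * harm + rng.normal(0, 0.05, n)
# Assumed pattern for humans: wrongness follows harm multiplied by intent.
human_scores = 0.9 * harm * intent + rng.normal(0, 0.05, n)

machine_fit = fit_wrongness_model(harm, intent, machine_scores)
human_fit = fit_wrongness_model(harm, intent, human_scores)
print("machines:", machine_fit)
print("humans:  ", human_fit)
```

In this simulation the machine fit puts nearly all of its weight on the harm term, which gives a flat, tilted plane, while the human fit puts its weight on the interaction term, which gives the curved surface that rises along the diagonal.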
This simple model led us to an interesting conclusion: an empirical principle governing the way in which people judge machines differently than humans. In its simplest form, the principle says, “People judge humans by their intentions and machines by their outcomes.”
But what do these experiments teach us about our moral intuitions?
First, they teach us that our moral intuitions are far from fixed. We may tell ourselves we are people of principles, but the truth is that we react differently depending on who or what we are judging. This moral flip-flopping goes beyond the way we judge humans and machines.
For instance, people’s moral judgments of politically motivated actions depend on whether the agent accused of wrongdoing is politically aligned with their viewpoints.
In a recent study, people reacted differently to a student seen throwing a glass bottle during a protest depending on whether the student was identified as a member of “the Antifa movement” or “the Patriot movement.” As you probably expect, respondents reacted more strongly against the student when his political identity was opposed to their own. The same study, however, also showed that this effect was not observed when violations involved nonpolitical scenarios, such as indoor smoking or drunk driving. So, our politically motivated judgments do not push us to flip-flop indiscriminately, but rather only in the context of particular situations.
The second lesson is that our moral flip-flopping can go beyond a simple bias favoring one group over another. If people simply favored humans over machines, the red and blue planes would have been parallel. But they are not. We do not just favor humans; we judge humans and machines differently. Compared with the consequentialist morality with which we judge machines, we apply a more Kantian or deontological morality (about means rather than goals) to our fellow humans.
But what is probably the most important lesson is that we can learn something about human morality by using techniques designed to teach machines. In our case, we did this by building a simple model connecting different aspects of moral judgments. In the case of Henry, Catherine and Robyn, we learned something about the morality of common sense by using a popular “dimensionality reduction technique.”
After Henry told me that the first dimension was “good versus bad,” he asked me to guess the second axis. “Easy versus difficult,” he said. “To a first approximation, all commonsense knowledge is about what is good or bad and what is easy or hard to do. The rest is just noise.”