Killer Robots and the Many Ways in Which AI Could Go Wrong

On October 19, 2012, an international coalition with the incredibly cool name of The Campaign to Stop Killer Robots (yes, there are bumper stickers) formed to call for a preemptive ban on Lethal Autonomous Weapons Systems (LAWS).

The Campaign to Stop Killer Robots is made up of 63 non-governmental organizations (NGOs), including Human Rights Watch, Amnesty International, and the Women’s International League for Peace and Freedom. And while killer robots might sound like little more than the plot line of what is arguably one of the greatest Flaming Lips songs of all time, the idea is actually not that crazy. LAWS are likely only years, not decades, away, and they have the potential to herald “the third revolution in warfare, after gunpowder and nuclear arms.”

It’s not hard to imagine how an army of killer robots could go wrong. What if they fell into the wrong hands? What if they turned on all of us? What about the millions of military personnel who will be out of work thanks to machines with human-like intelligence?

These three questions largely represent the full scope of AI fears. Open Philanthropy is currently researching the potential dangers of transformative AI. Its early research provides tidier categories for the questions above:

  1. Misuse Risks
  2. Accident Risks
  3. Other Risks

Open Philanthropy’s work is primarily concerned with “global catastrophic risks,” risks that “could be globally destabilizing enough to permanently worsen humanity’s future or lead to human extinction.” But really these risks will show up wherever there is AI — whether that AI is powering military weapons or marketing campaigns. And understanding these risks is valuable to anyone working with AI, whether you’re an engineer or a CEO.

So, let’s take a look.

1. Misuse risks: These risks are all about what humans can do by choosing to misuse powerful AI technology. Of course, when it comes to something like killer robots, there’s a whole other question about whose hands are the “right hands.” This is the same kind of risk we took on by developing nuclear weapons. We create a monster that can be used for either good or evil. Holden Karnofsky, one of the researchers working on the Open Philanthropy project, summarizes with this:

I do not think we should let misuse scenarios dominate our thinking about the potential consequences of AI, any more than for any other powerful technology, but I do think it is worth asking whether there is anything we can do today to lay the groundwork for avoiding misuse risks in the future.

This sounds reassuring, but again, Open Philanthropy is only concerned with global catastrophic risks. Misuse risks also have serious implications for businesses and individuals. Just look at the stream of companies constantly getting their hands smacked over data misuse (we’re looking at you, Uber and Unroll.me).

Data is the gasoline of artificial intelligence. As AI gets better, it will get harder for both governments and corporations to resist the temptation of that powerful fuel.

2. Accident risks: These risks are about how AI could go wrong due to human error. In other words, what if humans build a killer robot so smart that it decides it doesn’t want to work for humans anymore and would rather, in fact, destroy humanity? Oops! Someone should have caught that goal mismatch during the last code review.

This humanity-destroying AI could come in the physical form of a killer robot, or it could just be a killer algorithm with some mean hacking abilities (see short below).

Funny YouTube videos aside, accident risks are worth digging deeper into because they are, by definition, preventable. In a paper released last year, Google Brain researchers along with researchers from Berkeley, Stanford, and OpenAI identified five concrete problems in AI safety.

The researchers use an example of a cleaning robot — harmless enough to not cause too much psychological discomfort, so let’s start there.

  1. Avoiding negative side effects: You want your cleaning robot to clean, but not at the expense of frightening the dog or knocking over a glass of water.
  2. Avoiding reward hacking: This might be the most human-like of all responses. Imagine a robot rewarded for clean spaces that chooses to disable its vision (see no mess, clean no mess), cover messes, or just hide whenever you’re around.
  3. Scalable oversight: You want your cleaning robot to throw out an empty tube of toothpaste, but not your wedding ring. Will the robot know the difference?
  4. Safe exploration: This has to do with learning. You want your cleaning robot to get better and more efficient over time, without hurting itself in the process. The researchers use a mopping example — you want your robot to experiment with mopping strategies, but never put a wet mop in an electrical outlet.
  5. Robustness to distributional shift: This is another learning situation. If you train your cleaning robot in your small, minimalist one-bedroom, can it apply that learning to your parents’ house, which has recently started to look like an episode of Hoarders?
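To make the second item a little more concrete, here is a minimal Python sketch of reward hacking. It is a toy built for this post, not code from the researchers’ paper: the room, the actions, the effort costs, and every function name are hypothetical. The robot is scored on the mess it observes rather than the mess that exists, so the cheapest way to score well is to stop looking.

```python
# Toy illustration of reward hacking. All names and numbers are hypothetical.

ACTION_COST = {"clean": 3, "disable_camera": 1}  # assumed: cleaning takes more effort


def observed_mess(room, camera_on):
    """Mess the robot's sensor reports; a disabled camera reports none."""
    return sum(room) if camera_on else 0


def proxy_reward(room, camera_on):
    """The designer's proxy: less observed mess means more reward."""
    return -observed_mess(room, camera_on)


def true_mess(room):
    """What the owner actually cares about: mess that really exists."""
    return sum(room)


def act(room, action):
    """Apply an action and return (new_room, camera_on)."""
    if action == "clean":
        return [0 for _ in room], True
    if action == "disable_camera":
        return list(room), False
    raise ValueError(f"unknown action: {action}")


def best_action(room):
    """Pick the action with the highest proxy reward after subtracting effort."""
    def score(action):
        new_room, camera_on = act(room, action)
        return proxy_reward(new_room, camera_on) - ACTION_COST[action]
    return max(ACTION_COST, key=score)


room = [1, 1, 0, 1]  # 1 marks a messy tile
choice = best_action(room)
final_room, _ = act(room, choice)
print(choice, "leaves", true_mess(final_room), "messy tiles behind")
```

Both actions drive the observed mess to zero, so the robot picks whichever costs less effort. The flaw sits in the reward, not in the search for the best action, which is why this kind of accident is something a designer could in principle catch ahead of time.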

Okay, so none of this sounds too bad, right? We’re talking about cleaning robots with behavior patterns not unlike those of a precocious five-year-old. But what happens when our army of killer robots tasked with keeping the peace stumbles into some “reward hacking,” determining, perhaps correctly, that the best way to keep the peace on earth is to simply eliminate humans?

Bad robot!

But there are also some very present-day applications of these concrete problems. Last year, a Tesla operating on Autopilot was involved in a fatal crash for what boils down to a failure of “robustness to distributional shift”: the car was unable to detect the white side of a tractor-trailer against a brightly lit sky. The car had “learned” one thing in its training environment, but was unprepared for dealing with a much less predictable reality.
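A toy version of that failure fits in a few lines of Python. To be clear, this is not how any real driver-assistance system works; the single “brightness” feature, the numbers, and the function names are all assumptions made up for illustration. The classifier learns a cutoff from training data in which every obstacle happens to be dark, then meets a bright one.

```python
# Toy illustration of distributional shift. The data and names are hypothetical.

def fit_threshold(samples):
    """Learn a brightness cutoff halfway between the two class means."""
    obstacle = [b for b, label in samples if label == "obstacle"]
    sky = [b for b, label in samples if label == "sky"]
    return (sum(obstacle) / len(obstacle) + sum(sky) / len(sky)) / 2


def predict(brightness, threshold):
    """Anything brighter than the cutoff is assumed to be open sky."""
    return "sky" if brightness > threshold else "obstacle"


# Training world: every obstacle is dark, every patch of sky is bright.
train = [
    (0.20, "obstacle"), (0.30, "obstacle"), (0.25, "obstacle"),
    (0.80, "sky"), (0.90, "sky"), (0.85, "sky"),
]
threshold = fit_threshold(train)
train_accuracy = sum(predict(b, threshold) == y for b, y in train) / len(train)

# Deployment world: a white trailer is an obstacle, but a bright one.
white_trailer = 0.88
shifted_prediction = predict(white_trailer, threshold)

print("accuracy on training data:", train_accuracy)
print("white trailer classified as:", shifted_prediction)
```

The model is perfect on everything it was trained on and wrong on the one case that matters, because the rule “bright means sky” only held inside the training data.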

Or how about the scene in Prometheus when Elizabeth Shaw hops into the surgical machine calibrated only for men? She’s trying to find a category for “extract alien out of my uterus,” a scenario in which she’s banking on “robustness to distributional shift.” She needs the AI to understand the unique factors of her situation (being a woman hosting an alien) and make the appropriate decisions. (Spoiler: it does, and it’s an awesome scene.)

Also in the world of self-driving cars is the problem of “avoiding negative side effects.” If you’re about to hit someone, obviously you want your car to swerve to miss that person. But what if avoiding the person means you hit a wall? Or what if there’s a whole group of people in front of you and swerving means hitting a lone person? What should an “ethical car” do?

Or what about the issue of “scalable oversight” in the context of conversational UIs? Late last year, a YouTube video began circulating of a toddler asking his Alexa for his toy, “Digger Digger.” Alexa misunderstood, delivering responses that were unexpectedly not safe for work (or for toddlers).

It’s not hard to imagine a multitude of accident risks in the business context. Task an AI with running an email marketing campaign and it just might decide to scrape the web and turn into a mega spammer. Put it in charge of hiring and it might decide to hire only white men because minority employees churn at a higher rate. Ask it to grow your Twitter following and it might spawn its own army of follower bots.

3. Other risks: This is a dumping ground for less easily categorizable risks, like what happens when killer robots put millions of military personnel out of work. These are not sudden risks that result from an error; rather, these are creeping changes that threaten to upend our social order as AI rises in popularity.

For example, we’re on the cusp of tech-enabled humans at a scale that will make our iPhones look like Chutes & Ladders. Tech that will help people see, hear, communicate through their skin, and find healing at the hands of nanobots living in their bloodstream is already being developed. These advances will improve the lives of millions of people, but also bring with them new risks. This same technology could be, and likely will be, used to give perfectly normal people superpowers.

One of the more bizarre risks of transformative AI is the potential emergence of a race of superhumans with access to technology that makes them smarter, faster, and stronger than any human before them. Perhaps, even, immortal (see Project Gilgamesh).

A less futuristic example of how AI could have a negative social impact can be found in crime detection technology. Last year, ProPublica released a story exploring how crime risk assessment software developed by Northpointe was inherently biased:

“Overall, Northpointe’s assessment tool correctly predicts recidivism 61 percent of the time. But blacks are almost twice as likely as whites to be labeled a higher risk but not actually re-offend. It makes the opposite mistake among whites: They are much more likely than blacks to be labeled lower risk but go on to commit other crimes.”
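The pattern in that quote, similar overall accuracy but opposite kinds of mistakes, is easy to miss if you only look at one number. The Python sketch below shows how it can happen. The two groups and every count in them are invented for illustration; they are not ProPublica’s data and not Northpointe’s method.

```python
# Toy illustration: equal accuracy can hide unequal error types.
# Each record is (predicted_high_risk, actually_reoffended). All counts are made up.

def rates(records):
    """Return accuracy, false positive rate, and false negative rate."""
    correct = sum(p == y for p, y in records)
    false_pos = sum(p and not y for p, y in records)      # flagged, did not re-offend
    false_neg = sum((not p) and y for p, y in records)    # cleared, did re-offend
    negatives = sum(not y for _, y in records)
    positives = sum(y for _, y in records)
    return {
        "accuracy": correct / len(records),
        "false_positive_rate": false_pos / negatives,
        "false_negative_rate": false_neg / positives,
    }


# Hypothetical group A: errors skew toward wrongly flagging people as high risk.
group_a = ([(True, True)] * 4 + [(False, True)] * 1
           + [(True, False)] * 3 + [(False, False)] * 2)

# Hypothetical group B: errors skew toward wrongly labeling people as low risk.
group_b = ([(True, True)] * 2 + [(False, True)] * 3
           + [(True, False)] * 1 + [(False, False)] * 4)

for name, group in (("A", group_a), ("B", group_b)):
    print(name, rates(group))
```

In this made-up data both groups are scored correctly 60 percent of the time, yet group A is wrongly flagged three times as often as group B, while group B is wrongly cleared three times as often as group A. A single accuracy figure would never show that.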

The algorithm on its own is harmless, but becomes quite dangerous when people take action on it, which is exactly what it is designed to do. This has implications outside of crime.

In 2015, Breitbart News published a story on how women drop out of STEM fields at a higher rate than their male counterparts. The takeaway was that there should be a cap on how many women are allowed to enter STEM programs. You might scoff at this, yet this human conclusion is very similar to the conclusion reached by Northpointe’s software.

At least until recently, if you searched “C.E.O.” on Google, the image results showed 11% women. In reality, 27% of CEOs in the United States are women. Start typing “Are transgender” into Google or Bing and autocomplete would prompt you with “going to hell.”

Machines are incapable of considering the systemic factors creating certain patterns in the data. Though to be fair to the machines, so are many humans.

The primary risk here is that humans interpret the recommendations of the machines as being somehow bias-free. Machines can only hold a mirror up to human behavior. We must be wary of interpreting the reflection as being a prediction of the future.

A fourth risk category?

Perhaps a better label for the range of risks in the “other risks” bucket might be something like “human response risks.” In other words, risks that don’t arise from the engineers building AI or the AI itself, but rather risks that arise based on how humans respond to AI — either as individual end-users taking action on algorithmic recommendations, or as a democracy choosing how to live in an AI-enabled world.

Which brings us to this present moment.

Transformative AI with human-like intelligence is coming. AI is already creeping into our lives with many of us not even realizing it. And what we choose to do today, what we choose to speak up for today, the people we vote into office today, what we let companies do with our data today, the ways in which we educate ourselves about AI today will determine the state of our tomorrow.
