Thursday, June 15, 2017

The Next AI Frontier: Teaching Computers to Lie

I got into a minor kerfuffle over at Steve Hsu's blog when I commented on this post about AlphaGo, the neural net program that has blown away all of the human competition in the game of Go. I said that I would be more impressed when computers mastered Diplomacy, a comment which was immediately challenged by someone else. But as impressed as I am with AlphaGo, I stand by that comment.

When I was a kid back in the '60s, the local science museum had a computer that played tic-tac-toe. (The local fair had a pigeon that played tic-tac-toe, but maybe that's a topic for another day.) I have no idea how the computer worked -- maybe an analog circuit of some sort? The London Science Museum at one point had a tic-tac-toe computer constructed from Tinkertoys and string. As primitive as these sound, I think you can draw a straight (but very steep!) line from tic-tac-toe to the computer that "solved" checkers, then to IBM's Deep Blue (which conquered chess), and finally to AlphaGo. These games (tic-tac-toe, checkers, chess, Go) are all deterministic, perfect-information games in which the players alternate turns, choosing from a finite number of moves. (To be fair, there is a qualitative leap between the earlier "brute force" programs, which relied simply on increasing the number of moves the computer scanned, and AlphaGo, which actually "learns" to play better, and whose programmers cannot determine why it chose a particular move.)
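For a game as small as tic-tac-toe, "brute force" means searching the entire game tree, which is why the game can be solved outright. Here's a minimal minimax sketch of that idea -- my own illustration in Python, not the code from any of the machines mentioned above:

```python
# Exhaustive minimax for tic-tac-toe. The full game tree is small
# enough to search to the end, so the program plays perfectly.
LINES = [(0,1,2), (3,4,5), (6,7,8),   # rows
         (0,3,6), (1,4,7), (2,5,8),   # columns
         (0,4,8), (2,4,6)]            # diagonals

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Score the position for X: +1 = X wins, 0 = draw, -1 = O wins."""
    w = winner(board)
    if w:
        return 1 if w == 'X' else -1
    if all(board):          # board full, no winner: draw
        return 0
    scores = []
    for i in range(9):
        if not board[i]:
            board[i] = player
            scores.append(minimax(board, 'O' if player == 'X' else 'X'))
            board[i] = None  # undo the trial move before trying the next
    return max(scores) if player == 'X' else min(scores)

# Perfect play from an empty board is a draw:
print(minimax([None] * 9, 'X'))  # -> 0
```

This exhaustive search is hopeless for chess or Go, of course -- the later programs had to prune, evaluate, or (in AlphaGo's case) learn instead.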

Diplomacy is a very different animal. For those of you not familiar with the game, it's a contest of almost pure negotiation, requiring the ability to form alliances, offer bribes, bluff, lie, and backstab your way to the top. It requires a very different, more "human" set of skills than chess or Go. Computer scientists are already working in that direction, particularly with the game of poker.

I myself wrote two chess programs back in the 1970s, one in high school (in BASIC), and the second in college (in Fortran, the One True Programming Language). Both of my programs were, in one sense, more advanced than Deep Blue. They cheated.

The cheating was not intentional -- both programs had bugs. When I was testing the BASIC program, the computer moved its rook diagonally, jumping over its own adjacent pawn, to capture one of my pieces. It turned out that once the computer entered the subroutine for capturing opposing pieces, it could never exit. It was compelled to capture a piece on every move it made. This made for aggressive, if somewhat irregular, play.

The bug in the Fortran program was more subtle. At one point I moved my queen, protected by my bishop, adjacent to the computer's king -- checkmate! The computer was supposed to note the checkmate and resign, but instead it made a move far away on the other side of the board. And when it printed out the board after its move, my queen had mysteriously vanished! Talk about misdirection. I was able to trace the error -- the computer attempted to capture my queen, saw that this move would leave it in check, and therefore returned the king back to its original square. But when it did this, it forgot to replace my queen.
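The underlying pattern is common to chess programs that search by making a move, examining the result, and then unmaking it. Here's a hedged reconstruction in Python (my own sketch, not the original Fortran) showing the step the old program forgot:

```python
# A board sketched as a dict mapping squares to pieces.
# Hypothetical example positions, not a real game.

def make_move(board, src, dst):
    """Move a piece from src to dst; return whatever was captured (or None)."""
    captured = board.get(dst)
    board[dst] = board.pop(src)
    return captured

def unmake_move(board, src, dst, captured):
    """Undo make_move. Restoring 'captured' is the step the Fortran
    program skipped -- which is exactly why the queen vanished."""
    board[src] = board.pop(dst)
    if captured is not None:
        board[dst] = captured

board = {'e1': 'wK', 'e2': 'bQ'}
captured = make_move(board, 'e1', 'e2')   # try: king takes queen
# ...search code notices the king would still be in check here...
unmake_move(board, 'e1', 'e2', captured)  # take it back correctly
print(board)  # -> {'e1': 'wK', 'e2': 'bQ'}, queen restored
```

Drop the `if captured is not None` restore and the program behaves just as described: the king returns home, and the queen silently disappears from the board.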

Teaching my computers to cheat was not deliberate, but I have no doubt that we will shortly see computer programs capable of lying convincingly. I don't think it will be a good thing.