Welcome back, readers. Another snow day in Boston; a good time to update the blog! In this interactive post, I will continue my recap of early experiments in automating characters with recorded gameplay data. You will find a demo to play with at the end.
Human waitress chats with an AI customer, trained with 5,000 games.
The implementation of the chat bot was brain-dead simple, yet the results were often surprisingly effective. I started by segmenting the log files into conversations, where a conversation is one or more uninterrupted lines of chat text sandwiched between physical actions (e.g. picking up steak from the table, sitting on a chair, or using the cash register). For each context, I then extracted a list of all word sequences that occur in at least five games. These sequences were then used to encode lines of chat text in the log files into signatures that can be quickly compared with chat text input at runtime.
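As a rough illustration of the preprocessing described above, here is a minimal Python sketch. It is not the original implementation. The `(kind, text)` event format, the function names, the three-word cap on sequence length, and the choice to represent a signature as a set of word sequences are all assumptions made for the example, and the sketch ignores contexts entirely, building a single vocabulary for all conversations.

```python
from collections import Counter
from typing import Iterable

# Hypothetical log format: each game is a list of (kind, text) events,
# where kind is "chat" or "action". The real log files may look different.
Event = tuple[str, str]
Sequence = tuple[str, ...]


def segment_conversations(events: Iterable[Event]) -> list[list[str]]:
    """Group uninterrupted runs of chat lines; any physical action ends a run."""
    conversations, current = [], []
    for kind, text in events:
        if kind == "chat":
            current.append(text)
        elif current:
            conversations.append(current)
            current = []
    if current:
        conversations.append(current)
    return conversations


def word_sequences(line: str, max_len: int = 3) -> set[Sequence]:
    """All contiguous word sequences of 1..max_len words (max_len is assumed)."""
    words = line.lower().split()
    return {
        tuple(words[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def build_vocabulary(games: list[list[Event]], min_games: int = 5) -> set[Sequence]:
    """Keep the word sequences that occur in at least min_games distinct games."""
    games_seen = Counter()
    for events in games:
        in_this_game = set()
        for conversation in segment_conversations(events):
            for line in conversation:
                in_this_game |= word_sequences(line)
        games_seen.update(in_this_game)  # each game counts once per sequence
    return {seq for seq, count in games_seen.items() if count >= min_games}


def encode(line: str, vocabulary: set[Sequence]) -> frozenset[Sequence]:
    """Assumed signature: the set of vocabulary sequences present in the line."""
    return frozenset(word_sequences(line) & vocabulary)
```

Counting each sequence once per game, rather than once per occurrence, is what makes the threshold mean "at least five games" instead of "at least five times".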
The chat bot encodes the user’s text input as a signature, searches for the best matching signatures in 5,000 log files, and then simply outputs the next line from the conversation with the best matching signature. Ties are broken arbitrarily, and there is a history bias to favor conversations that contain previous lines that match the history of the interaction currently in progress.
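The lookup step can be sketched along the following lines. Again, this is a hedged sketch rather than the actual system: the post does not say how signatures are compared, so Jaccard overlap is swapped in as an assumed similarity measure, and the `LoggedLine` record, the `weight` of the history bias, and the 0.5 match threshold are hypothetical choices for illustration.

```python
from typing import NamedTuple, Optional

Signature = frozenset  # a set of tokens or word sequences, however it was encoded


class LoggedLine(NamedTuple):
    signature: Signature               # signature of a chat line from the logs
    next_line: Optional[str]           # the line that followed it in its conversation
    previous: tuple[Signature, ...]    # signatures of earlier lines in that conversation


def similarity(a: Signature, b: Signature) -> float:
    """Jaccard overlap; an assumed stand-in for the unspecified comparison."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def history_bonus(candidate: LoggedLine, history: list[Signature],
                  weight: float = 0.1) -> float:
    """Favor logged conversations whose earlier lines resemble the current history."""
    matches = sum(
        1 for past in history
        if any(similarity(past, prev) > 0.5 for prev in candidate.previous)
    )
    return weight * matches


def respond(user_signature: Signature, history: list[Signature],
            corpus: list[LoggedLine]) -> Optional[str]:
    """Return the logged line that followed the best match, or None if nothing matches."""
    best, best_score = None, 0.0
    for candidate in corpus:
        if candidate.next_line is None:
            continue  # last line of a conversation: nothing to say next
        base = similarity(user_signature, candidate.signature)
        if base == 0.0:
            continue  # no surface overlap at all
        score = base + history_bonus(candidate, history)
        if score > best_score:  # strict ">": ties keep whichever came first
            best, best_score = candidate, score
    return best.next_line if best else None
```

With no history, two identical logged lines tie and the first one found wins; once the history contains a line that also appears earlier in one of those conversations, that conversation is preferred.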
I want to stress that this chat bot was just a quick experiment, and is not the system I am using in my more recent work. However, this simple implementation turned out to be quite a compelling (and entertaining) way to explore the data, and to illustrate how far you can get without any semantics, relying only on surface text similarities, if you have lots of data. The results are not perfect, and the bot is not going to pass the Turing test, but compared to other chat bots it’s not doing badly, especially considering that it requires zero hand-authoring: just turn the crank and get a chat bot. I never formally evaluated the chat system because it was just an exploration, but some implementation details are covered in this AAAI symposium paper.
I ended up demo-ing this system for years at presentations and lab events. Running it is a bit like Russian roulette. It often works well and impresses users, but there is always the chance of an incorrect, or even obscene, response. The data is full of witty responses and handles a number of cases that I would not have covered if authoring the dialogue by hand. Below are some highlights.
Here is one example of the bot’s wit. In all examples, the bot's response is in bold:
WAITRESS: Welcome to the restaurant
CUSTOMER: Hi
WAITRESS: hi, how many in your party?
CUSTOMER: a table for one please.
WAITRESS: smoking or non?
CUSTOMER: WOW, you have a smoking section?
CUSTOMER: I don’t smoke, but I will start tonight
CUSTOMER: take me there!

WAITRESS: hi
CUSTOMER: hi
WAITRESS: do you have a reservation?
CUSTOMER: I have personal reservations
CUSTOMER: but I wave them when it comes to lust.
When interacting with a Japanese visitor to the lab, we were both surprised when the bot corrected his English:
WAITRESS: welcome
CUSTOMER: table for one please
WAITRESS: have a sit
CUSTOMER: *seat
Sometimes the response is not quite right, but still entertaining – better than no response at all. (Note that in this case, the bot is playing as the waitress):
CUSTOMER: you have nice eyes
WAITRESS: thanks! :) I work out
The most interesting thing about this data-driven approach is seeing the edge cases that get picked up. When I noticed that one of the files of extracted phrases looked suspiciously small, and investigated which context it was associated with, I found that the system had learned to say “oops” when the waitress dropped something on the floor. So what, right? That’s obviously what you should say when you drop something. Well, what makes this interesting is that the user interface actually does not allow the players to put things down on the floor – they can only put things down on furniture and other objects. Dropping things on the floor occurs as the result of a ray-casting bug, when trying to place something on a table. In this case, the AI system has learned an appropriate response for something the designer never realized could even happen!
Another edge case is related to a fruit bowl in the back of the kitchen. In many games, decorative props like these would simply be ignored by the AI. In contrast, the chat system learns that a waitress should say “on the house” when putting the fruit down on a customer’s table. One of the downsides of learning from recurring patterns of text is that the system fails to pick up lines that might be gems, but are only observed once in thousands of games. For example, in one game the customer responds “Damn girl, that is serious fruitage!” when the waitress puts the fruit bowl on his table. More recent work is looking at ways to capture these gems by including a human in the loop of the data-mining process.
Colorful Interactions (to say the least)
As mentioned earlier, the chat bot does have the potential to offend people. At a presentation in Plano, Texas, the bot offered Paul Tozour a lap dance. When I demoed the bot at the lab for the VP of a major American corporation, she was surprised to see this:
WAITRESS: How may I help you?
CUSTOMER: Get me a table b****
I have to admit to getting some satisfaction from one demo, given for a gray-haired reporter who told me my research didn’t make any sense to anyone over the age of 40. It went like this:
WAITRESS: welcome
CUSTOMER: shut up wh***
I don’t think he actually noticed the bot’s response; he couldn’t read it through his bifocals. Obviously we can censor obscene words, but where’s your sense of adventure? Players can still say some pretty bad things without swearing. Microsoft learned this the hard way when they released a potty-mouthed Santa Claus bot.
I wanted to embed the chat bot applet into this blog post, but Java’s security regime defeated me, and I had to put it on its own page. You can find the demo applet here.