Making a chat bot

Tuesday, December 15, 2009

Don't forget the FREE stuff!

Making a chat bot is hard. You need to design your topics and conversations carefully so you can cover most of your users' questions. You need to spend extra effort to handle stupid user spelling mistakes. You probably need to develop some adapter or interfacing code so the bot can look things up in a database or something. Even after you have done all that, your bot will still look bad because of one thing: human curiosity!

When people know they are talking with a chatbot, they tend to screw with it (probably because it would be improper to screw with a real person). They will try things that are way off topic for your specialized bot. And if you don't handle that gracefully, they will think your bot "can't do anything!" even if it handles relevant questions perfectly well.

Fortunately, there are existing chatbots made for this kind of "free talk" with people. They are not built for information retrieval but for leisure chatting. The most famous one is AliceBot, based on AIML. Even though I think my CML is better and makes it much easier to build a good information retrieval bot from scratch, I can't compete with AIML's AAA set when it comes to "leisure talk".

The AAA set contains 47205 conversations, or "categories" in AIML's terms. It took years to build and will make your bot look smart when users go off topic. The best part is that it's completely free. It would be stupid not to make use of this set (or a subset of it). Your chatbot can answer all the questions related to your domain or topic. When things go out of scope, let them fall through to the AAA set, so your bot can deal with unrelated questions gracefully.

This is exactly what I did when making CML's Cindy Demo bot. It handles CML, shopping assistance demo and weather questions. Everything else is handled by Plugin ActionProgramD, an open source plugin for the ProgramD version of AliceBot, which is capable of running the unmodified AAA set. This way, a fairly smart chatbot was built in a matter of several hours.
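The fall-through idea can be sketched in a few lines of Python. This is only an illustration of the dispatch order, not how Cindy or ActionProgramD is actually implemented: `domain_bot`, `free_talk_bot` and `respond` are hypothetical names, and the canned replies stand in for real conversation sets.

```python
from typing import Callable, Optional, Sequence

# A handler returns a reply, or None when the input is outside its scope.
Handler = Callable[[str], Optional[str]]


def domain_bot(text: str) -> Optional[str]:
    """Hypothetical stand-in for the topic-specific conversations."""
    if "weather" in text.lower():
        return "Where do you live?"
    return None


def free_talk_bot(text: str) -> Optional[str]:
    """Hypothetical stand-in for a general-purpose set such as AAA."""
    return "Interesting. Tell me more."


def respond(text: str, handlers: Sequence[Handler]) -> str:
    # Try each handler in order; the first non-None reply wins.
    for handler in handlers:
        reply = handler(text)
        if reply is not None:
            return reply
    return "Sorry, I have no answer for that."
```

Calling `respond(text, [domain_bot, free_talk_bot])` keeps the specialized bot in charge of everything it recognizes and only hands the rest to the free-talk set.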

Sunday, December 6, 2009

Create a "Hello" CML bot

To create a simple CML bot, you will need a Java runtime, the CML development toolkit and your favorite text editor. The toolkit comes with a CML interpreter (Cindy), Plugin examples and some CML examples.

To run Cindy, you need to have "java" in your PATH.
Run CindyGui.bat (for Windows) or CindyGui.sh (for Linux). The bot configuration file (mybot.xml by default) defines the list of source files loaded when the bot starts.
Readme.txt contains useful information about what's included in the toolkit, along with some instructions.
You need a JDK (javac and jar) to compile Plugins.

To host your bot, you can simply use the free NovaBot hosting service. A standalone Tomcat bot server is also available for download there.

Now, let's look at a Hello Conversation:

<CML version="0.1">
<Conv>
<Pattern> HELLO * </Pattern>
<Answer> Hello there. </Answer>
</Conv>
</CML>

Save this file as hello.cml in the unzipped toolkit directory. This document assumes that this is your working directory. Open mybot.xml and change the value of "SourceList" to hello.cml. Then start CindyGui in this directory. If anything is wrong, guilbotlog.txt will show you the location of the error. It will also show you exactly how each conversation is matched and executed.

If everything is right, you will see the bot prompt. Type in "hello" and the bot will respond with "Hello there". guilbotlog.txt has detailed logging information about source loading and execution. You can change the "LogLevel" setting in mybot.xml to get less or more logging. The log is your most powerful debugging tool for now. You can see exactly which line is matched and executed.

Let's look at the tags used here.

CML

A CML file must be enclosed in the CML tag. CML can take an optional attribute "version". Currently, the Cindy interpreter only accepts version <=0.5.

Conv

Conv is short for Conversation. It is the centerpiece of CML. It works somewhat like a function or subroutine in a programming language. It is defined as one exchange of information between the chat bot and a human user. A Conversation may have an optional Name, which can be used to reference it.

The <Pattern> tag defines the string that user input will be matched against. Matching is case insensitive and done word by word. The wildcard * can be used to match zero or more words. The robot will try to match the human input against the <Pattern> and, on a match, start processing, possibly responding to the human with the content defined in the <Answer> tag. You can have multiple Patterns in a conversation. Those Patterns have a logical "OR" relation, which means that if any Pattern element in the Conversation is matched, the Conversation is considered matched.
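To make the matching rule concrete, here is a minimal Python sketch of case-insensitive, word-by-word matching where * covers zero or more words. It is not Cindy's actual code: the function names are made up for illustration, and punctuation handling is left out on purpose.

```python
from typing import Sequence


def matches(pattern: str, text: str) -> bool:
    """Case-insensitive, word-by-word match; '*' covers zero or more words."""
    p = pattern.upper().split()
    t = text.upper().split()

    def walk(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern used up: it is a match only if the input is used up too.
            return j == len(t)
        if p[i] == "*":
            # Let '*' absorb 0, 1, 2, ... of the remaining words.
            return any(walk(i + 1, k) for k in range(j, len(t) + 1))
        return j < len(t) and p[i] == t[j] and walk(i + 1, j + 1)

    return walk(0, 0)


def conversation_matches(patterns: Sequence[str], text: str) -> bool:
    # Multiple Patterns in one Conversation are OR-ed together.
    return any(matches(p, text) for p in patterns)
```

With this sketch, `matches("HELLO *", "hello")` is true because * is allowed to match zero words, while "well hello" does not match because the first word differs.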

The <Answer> tag defines the bot's response. In our example, the response is always "Hello there". The <Answer> tag can contain plain text, a C-Expression, other tags or even an Action, so the response can be very flexible. You don't have to put all processing in the Answer element. All elements in a matched Conversation, except Pattern and sub Conversations, are executed sequentially. However, only Answer and Ask can output to the human user.

The CML interpreter always tries to match literal words before it tries to match *. So suppose you define another conversation like this:

<Conv>
<Pattern> HELLO Cindy </Pattern>
<Answer> Hey, I am glad you know my name.
</Answer>
</Conv>

If you add these lines to your hello.cml and say "Hello Cindy" to the bot, it will respond with "Hey, I am glad you know my name." instead of "Hello there".
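One simple way to get this "literal words before *" behaviour is to rank all matching patterns by how many literal words they contain. The Python sketch below does that. It only approximates the rule described above (the real interpreter may well decide word by word while walking its patterns), and `best_answer` and `literal_count` are hypothetical names.

```python
from typing import Dict, List, Optional


def matches(p: List[str], t: List[str]) -> bool:
    """Word-by-word match where '*' covers zero or more words."""
    if not p:
        return not t
    if p[0] == "*":
        return any(matches(p[1:], t[k:]) for k in range(len(t) + 1))
    return bool(t) and p[0] == t[0] and matches(p[1:], t[1:])


def literal_count(pattern: str) -> int:
    # Assumption: more literal words means a more specific pattern.
    return sum(1 for w in pattern.split() if w != "*")


def best_answer(conversations: Dict[str, str], text: str) -> Optional[str]:
    words = text.upper().split()
    hits = [p for p in conversations if matches(p.upper().split(), words)]
    if not hits:
        return None
    # Among all matching patterns, prefer the most specific one.
    return conversations[max(hits, key=literal_count)]


bot = {
    "HELLO *": "Hello there.",
    "HELLO Cindy": "Hey, I am glad you know my name.",
}
```

Both patterns match "Hello Cindy", but "HELLO Cindy" has two literal words against one, so its answer is chosen.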

Introduction to CML

CML stands for Conversation Markup Language. It is used to develop chat bots like this one. The idea behind CML is a simple markup language that can be used to manage the "content" of a chatbot. CML has the following key features:

1. A smart mis-spelling correction system. CML takes a dictionary file provided by the bot author for mis-spelling correction. It will also try to automatically reduce some common word forms such as Present Continuous (ing), Past Tense (ed), etc. This type of correction is performed recursively, so a mis-spelled past tense can be corrected to the original form of the verb. This greatly reduces the number of "match patterns" needed for a pre-stored Question and Answer Conversation. Furthermore, this system is integrated with the context based score system, so corrections are applied only when they are in the right context and make the most sense.

2. Tree-based conversation model. The conversation tree is a concept used in many pattern matching based chat bot systems. In CML, the conversation tree is literally an XML tree. Each follow up conversation or question is coded as a sub-node of the current conversation. So it is very easy to make a wizard or problem solving chat bot by presenting a series of follow up questions or suggested solutions. Conversations are organized into Topics. Topics themselves are organized in a tree structure as well. So the bot source is very easy to read and follow. This structure also provides great benefits for context support.

3. Inherent strong context support. CML provides strong context support through the Conversation tree, topic matching and session variables. Human conversations are usually continuous. The current conversation is usually a follow up of the previous conversation or the previous topic. With the tree-based conversation model, a conversation inherently knows whether it is a follow up, can expect what will come next and understands the previous situation. When the next conversation does change topic, there is usually some key word or phrase that clearly indicates this. The topic matching process can recognize that and give the conversation the "context" of that topic. Session variables are also a very powerful tool. Using session variables, the chat bot's client interface can pass "context hints" to the chat bot, and the bot can "remember" information that was provided by the human user in a previous conversation, or a Plug-in result that could be presented later.

4. Simple pattern matching syntax. CML uses a very simple wildcard based syntax for conversation pattern matching instead of a regular expression system (though regular expressions are supported through its built-in functions). So a bot author does not have to be a developer.

5. Modularized and reusable conversations. CML conversations are like function modules in a programming language. A well-written conversation sub-tree can be referenced and used in other conversations. For example, if a bot author wrote a conversation that asks for a customer's address, he could reuse that Conversation sub-tree anywhere the bot needs the human to provide an address. This also makes the bot source more organized and readable.
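The recursive correction described in the first feature can be sketched roughly as follows. This is only a guess at the general shape, not CML's algorithm: the word lists, the suffix list and the name `base_form` are all hypothetical, and the context based scoring is left out.

```python
from typing import Dict, Optional, Set

VOCAB: Set[str] = {"receive", "walk", "ship"}  # hypothetical known words
FIXES: Dict[str, str] = {"recieve": "receive", "wlak": "walk"}  # hypothetical dictionary
SUFFIXES = ("ing", "ed", "d", "s")


def base_form(word: str, depth: int = 3) -> Optional[str]:
    """Reduce a possibly mis-spelled, inflected word to a known base form."""
    word = word.lower()
    if word in VOCAB:
        return word
    if depth == 0:
        return None
    if word in FIXES:
        # Apply the dictionary fix, then keep reducing the result.
        return base_form(FIXES[word], depth - 1)
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            found = base_form(word[: -len(suffix)], depth - 1)
            if found is not None:
                return found
    return None
```

Here "recieved" first loses its "d", the remaining "recieve" is fixed by the dictionary, and the result "receive" is a known word, so a single pattern written with "receive" can cover all of these inputs.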

API VS mark up language

When implementing a chat bot system, a lot of people use a purely API based system. The obvious reason is flexibility. When you are using an API based system, your bot authors are trained developers. Your bot is exposed to the full functionality of a real programming language. It is very easy to make your bot interface with different chat systems, like a chatroom, AIM, Live Messenger, etc. The shortcoming is that it is hard to maintain the "content". A smart chatbot has to be loaded with complex knowledge to do something useful, and it is not easy or straightforward to put that knowledge in the code.

Naturally, developers tend to organize this "content" related information in a data file. However, the content can be quite complicated and have its own structure. Along this line of thinking, AIML was invented. It does quite a good job of organizing questions and answers. It also has some context support through variables and javascript, so some degree of flow control or a programming interface can be utilized. It also separates the roles of developer and bot author. From now on, bot authors can just be tech writers or people who have domain knowledge and limited xml training. They can be in full charge of a chat bot's behaviour, fairly comparable to a webmaster in charge of a website.

CML makes the next leap. It provides sophisticated context support by introducing tree-structured topics and conversations, context-aware mis-spelling correction and a tag based semi-programming system. It also provides a Java plug-in interface so you can access the full functionality of a real programming language if necessary. It can be as easy as AIML, with better context recognition and match rate, but as functional and flexible as an API based system. You can have a bot author maintain the "content" side of a chatbot and a developer code the plug-ins that interface with an enterprise system. It is an attempt to get the best of both sides, and it is doing that fairly well. Without any plug-in, it is already a better bot system than AIML due to its context support, mis-spelling correction and score based match system. With plug-ins, you can do all the tricks an API based system can do.

Monday, November 23, 2009

Spelling correction

We are terrible at spelling. And in the age of the internet, it is definitely getting worse. When we are chatting online or texting, we use all kinds of shortcuts to save a couple of keystrokes. When was the last time I actually typed "see you later" instead of "c u"? It seems to be a century ago. Also, a lot of the time I seem to be typing "thx" instead of "thanks", or sometimes "thaks".

A pattern matching based conversation agent can't handle those mis-spellings directly. Basically, there are three ways to tackle this problem. The first is to use a sophisticated matching algorithm that is capable of matching words against likely mis-spellings, such as matching "restaraunt" to "restaurant", or "thaks" to "thanks". This can produce some amazing results, but it tends to be quite slow and usually does not work well with internet slang. The second way is to actually program all the possible mis-spelling combinations into the pre-stored conversation info. This works well for a small set. It can catch special phrases like "lol" and "c u". However, as the conversation set gets bigger, the number of pre-stored mis-spellings grows exponentially. The third way (which I consider the best) is to have a word by word (and for some phrases, phrase by phrase) mis-spelling dictionary and a special algorithm to match mis-spelled words to correctly spelled words. The dictionary can work at two ends. One end is the chat client, where it acts like a prompt or hint. Pidgin has a small dictionary that works that way, and everybody is familiar with Google's search prompt. The other end is the backend, where the mis-spellings are fixed automatically.
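A minimal Python sketch of the third approach, running on the backend, could look like the following. The entries in `CORRECTIONS` are made-up examples taken from the text above, the function name `correct` is hypothetical, and a real dictionary would come from the bot author.

```python
from typing import Dict, List

# Hypothetical entries; a real dictionary would be supplied by the bot author.
CORRECTIONS: Dict[str, str] = {
    "c u": "see you",
    "thx": "thanks",
    "thaks": "thanks",
    "restaraunt": "restaurant",
}


def correct(text: str, max_phrase: int = 2) -> str:
    """Replace known mis-spelled words and short phrases (output is lowercased)."""
    words = text.lower().split()
    out: List[str] = []
    i = 0
    while i < len(words):
        # Prefer the longest phrase entry that starts at this word.
        for n in range(max_phrase, 0, -1):
            chunk = words[i:i + n]
            if len(chunk) == n and " ".join(chunk) in CORRECTIONS:
                out.append(CORRECTIONS[" ".join(chunk)])
                i += n
                break
        else:
            # No entry found: keep the word as typed.
            out.append(words[i])
            i += 1
    return " ".join(out)
```

Because the conversation patterns only ever see the corrected text, the mis-spellings live in one table instead of being repeated in every pattern.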

Personally, I prefer that spelling correction happen on the backend instead of the front end. Typing a few keywords and having a hint or prompt at my fingertips is one thing. Having that for everything I type during a chat session is just annoying. And I don't want to be corrected while I'm typing "thx".

There is one problem with spelling correction on the backend. English is ambiguous. A mis-spelled word can be corrected in multiple ways, and not every "u" should be corrected to "you". However, a conversation agent does have a distinct edge: the context. With the help of context, we can usually correct mis-spellings properly. In fact, this is exactly how humans do it. Based on the context, we can usually recognize other people's typos and correct them in our mind with very little ambiguity.
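As a rough illustration of letting context pick the correction, the Python sketch below scores every possible reading of the input by how many of its words the current context expects. The candidate table, the context word sets and the scoring rule are assumptions made up for this example. CML's own score system is not shown here.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

# Hypothetical candidate table: each token maps to its possible readings.
# Listing the token itself first means "leave unchanged" wins ties.
CANDIDATES: Dict[str, List[str]] = {
    "u": ["u", "you"],
    "thx": ["thanks"],
}


def correct_in_context(text: str, context_words: Set[str]) -> str:
    words = text.lower().split()
    options = [CANDIDATES.get(w, [w]) for w in words]

    def score(candidate: Tuple[str, ...]) -> int:
        # Count how many words of this reading the current context expects.
        return sum(1 for w in candidate if w in context_words)

    # Try every combination of readings and keep the best scoring one.
    return " ".join(max(product(*options), key=score))
```

In a farewell context "see u later" becomes "see you later", while in a shopping context "u neck shirt" is left alone. Trying every combination is fine for short chat inputs but would need pruning for long ones.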

Sunday, November 22, 2009

It's the Context, Stupid!

What ultimately makes a response to a question logical and proper? What edge does a conversation agent have over a search engine? What makes a chat bot / chatter bot smart? It is the recognition of CONTEXT!

Here is an example conversation between a real human and CindyBot:

Human : How's the weather like ?
Cindy : Where do you live ?
Human : Kirkland, WA
Cindy : 42.6F / 5.9C, Scattered Clouds - 8:45 PM PST Nov. 22
Human : tomorrow?
Cindy : Cloudy with a chance of rain in the morning...then rain likely in the afternoon. Highs in the 40s to lower 50s. Southeast wind 10 to 15 mph.

Look at the human's inputs. Without the context established by the first question, "tomorrow?" makes no sense. Imagine what Google's results for the keyword "tomorrow" would be. Cindy was able to respond to "tomorrow?" because it recognized the context. Context here means that this is a follow up question about the weather in Kirkland, WA. With this implicit information, Cindy can go and retrieve tomorrow's weather info for Kirkland and generate the proper response.

Context is not necessarily the previous conversation. Context can be many things. It can be the web page a customer is currently browsing; it can be the items in the customer's cart; it can be the current time, the location or even the browser version the user is running. CML provides strong context support based on this information. Basically, the idea is that the next conversation is likely to be a follow up of the previous conversation, or on the same topic. And a web page should be able to send "hints", like the current page or the current items in the shopping cart, to the conversation agent to augment its response generation.
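The weather exchange above can be mimicked with a tiny amount of state. The Python sketch below is a toy, not CML: `Session`, `reply`, the "city" variable and the canned replies are all hypothetical, and no real weather service is called. It only shows how a remembered topic, a session variable and a client-supplied hint change what "tomorrow?" means.

```python
from typing import Dict, Optional


class Session:
    """Hypothetical per-user state: current topic plus session variables."""

    def __init__(self, hints: Optional[Dict[str, str]] = None) -> None:
        self.topic: Optional[str] = None
        # Hints could come from the web page, e.g. the user's known city.
        self.vars: Dict[str, str] = dict(hints or {})


def reply(session: Session, text: str) -> str:
    words = text.lower().strip("?!. ").split()
    if "weather" in words:
        session.topic = "weather"
        if "city" not in session.vars:
            return "Where do you live?"
        return f"Current weather for {session.vars['city']}"
    if session.topic == "weather" and "city" not in session.vars:
        # The bot just asked for a location, so treat this input as the answer.
        session.vars["city"] = text.strip()
        return f"Current weather for {session.vars['city']}"
    if session.topic == "weather" and words == ["tomorrow"]:
        # A follow up only makes sense inside the weather topic.
        return f"Tomorrow's forecast for {session.vars['city']}"
    return "Sorry, I did not understand that."
```

Feeding it "How's the weather like ?", "Kirkland, WA" and "tomorrow?" in turn follows the same path as the dialogue above, while "tomorrow?" sent to a fresh session falls through to the apology.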

Two basic approaches of making a chat bot

Chat bots are usually developed using one of two approaches. The first approach uses Natural Language Processing algorithms. This involves both linguistic analysis and "understanding". "Understanding" is not a very well defined task. Most of the time it means some form of matching against pre-stored information and using some logical reasoning to produce a response. The term "understand" is inherently meaningless to me, because the only available criterion for demonstrating "understanding" is the ability to produce valid responses. NLP is a very hard problem. Though some NLP systems perform better than others, it is fair to say that, to this day, no general purpose NLP system can provide conversational artificial intelligence.

The second approach is pattern matching. This is the method used by most chat bots since the 1966 Eliza bot. It is basically the same approach used by search engines. In a chat bot system, pattern matching is usually augmented by other techniques to produce better results. For example, Kyle combines real-time learning with evolutionary algorithms to optimise its ability to communicate based on each conversation held.

The latter approach also works well for information retrieval (just like a search engine), which is one of the most practical uses of chat bots. Some specialized software and programming languages have been created specifically for this narrow function. For example, A.L.I.C.E. utilises a programming language called AIML, which is specific to its function as a conversational agent. My CML and NovaBot are also such a system. In fact, it was inspired by AIML. However, I did add my own "augments" to general pattern matching and focused on the conversation CONTEXT.