Sunday, December 6, 2009

Introduction to CML

CML stands for Conversation Markup Language. It is used to develop chat bots like this one. The idea behind CML is a simple markup language that can be used to manage the "content" of a chatbot. CML has the following key features:

1. A smart misspelling correction system. CML takes a dictionary file provided by the bot author for misspelling correction. It also tries to automatically correct some other common forms, such as the present continuous (-ing) and the past tense (-ed). This correction is performed recursively, so a misspelled past tense can be corrected to the base form of the verb. This greatly reduces the number of match patterns needed for a pre-stored question-and-answer conversation. Furthermore, the system is integrated with the context-based score system, so corrections are applied only when they are in the right context and make the most sense.

2. Tree-based conversation model. The conversation tree is a concept used in many pattern-matching chat bot systems. In CML, the conversation tree is literally an XML tree. Each follow-up conversation or question is coded as a sub-node of the current conversation, so it is very easy to make a wizard or problem-solving chat bot by presenting a series of follow-up questions or suggested solutions. Conversations are organized into Topics, and Topics themselves are organized in a tree structure as well, so the bot source is easy to read and follow. This structure also provides great benefits for context support.

3. Inherently strong context support. CML provides strong context support through the conversation tree, topic matching, and session variables. Human conversations are usually continuous: the current conversation is usually a follow-up to the previous conversation or topic. With the tree-based conversation model, a conversation inherently knows whether it is a follow-up, can anticipate what will come next, and understands what came before. When the next conversation does change topic, there is usually some keyword or phrase that clearly indicates this. The topic matching process can recognize that and give the conversation the "context" of the new topic. Session variables are also a very powerful tool. Using session variables, the chat bot's client interface can pass a "context hint" to the chat bot, and the bot can "remember" information that the human user provided earlier in the conversation, or a plug-in result that could be presented later.

4. Simple pattern matching syntax. CML uses a very simple wildcard-based syntax for conversation pattern matching instead of a regular expression system (regular expressions are still supported through its built-in functions, though). So the bot author does not need to be a developer.

5. Modularized and reusable conversations. CML conversations are like function modules in a programming language. A well-written conversation sub-tree can be referenced and used in other conversations. For example, if a bot author wrote a conversation that asks for a customer address, he could use that conversation sub-tree anywhere the bot needs a human to provide an address. This also makes the bot source more organized and readable.
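To make the recursive correction in feature 1 more concrete, here is a minimal Python sketch of the idea. It is not CML's actual implementation: the names `DICTIONARY`, `CORRECTIONS`, `SUFFIXES` and `normalize` are all made up for illustration, and the context-based scoring step is left out entirely.

```python
# Hypothetical sketch only: none of these names come from CML itself.
DICTIONARY = {"walk", "talk", "ask"}          # known base words
CORRECTIONS = {"wlak": "walk", "aks": "ask"}  # author-supplied misspelling -> correct form
SUFFIXES = ("ing", "ed", "s")                 # inflections to strip before retrying


def normalize(word, depth=0):
    """Return the dictionary form of `word`, or None if none can be found."""
    if word in DICTIONARY:
        return word
    if word in CORRECTIONS:
        return CORRECTIONS[word]
    if depth >= 3:  # guard against runaway recursion
        return None
    for suffix in SUFFIXES:
        # Strip one inflection and try again on the remainder.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            base = normalize(word[: -len(suffix)], depth + 1)
            if base is not None:
                return base
    return None
```

With these sample tables, a misspelled past tense such as "wlaked" is first stripped to "wlak" and then corrected to "walk", so one match pattern on "walk" would be enough to cover it.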
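The conversation tree and follow-up context from features 2 and 3 could be modeled roughly as below. This is only a sketch under assumptions: real CML stores the tree as XML, while the `Conversation` and `Session` classes, the `*` patterns and the sample printer dialogue here are invented for illustration. Topics and session variables are not modeled.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Conversation:
    """One node of a hypothetical conversation tree."""
    pattern: str                                   # wildcard pattern, lower case
    reply: str                                     # what the bot answers
    followups: list = field(default_factory=list)  # child conversations


class Session:
    """Tracks where in the tree the current chat is."""

    def __init__(self, roots):
        self.roots = roots    # top-level conversations
        self.current = None   # last matched conversation, if any

    def respond(self, text):
        text = text.lower().strip()
        # Follow-ups of the previous conversation are tried first, so a bare
        # "no" is only understood in the context of the question just asked.
        followups = self.current.followups if self.current else []
        for node in followups + self.roots:
            if fnmatchcase(text, node.pattern):
                self.current = node
                return node.reply
        return None  # nothing matched


printer = Conversation("*printer*", "Is the printer switched on?", [
    Conversation("yes*", "Then check the cable."),
    Conversation("no*", "Please switch it on and try again."),
])
greeting = Conversation("hello*", "Hi, how can I help?")
session = Session([printer, greeting])
```

In this sketch, answering "No" right after the printer question reaches the follow-up node, while the same "No" at the start of a chat matches nothing, which is the kind of context the tree gives for free.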
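For feature 4, the exact CML wildcard syntax is not shown in this post, so the following Python sketch simply assumes that `*` stands for any run of characters and that matching ignores case. The function names `wildcard_to_regex` and `matches` are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a '*' wildcard pattern into a compiled, case-insensitive regex."""
    # Escape everything literally, then join the pieces with '.*' for each '*'.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(parts) + "$", re.IGNORECASE)


def matches(pattern, sentence):
    """Return True if the whole sentence fits the wildcard pattern."""
    return wildcard_to_regex(pattern).match(sentence.strip()) is not None
```

A pattern such as "what is your *" is something a non-developer can read and write, and under these assumptions it matches "What is your name" without the author ever seeing the regular expression behind it.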
