I’ve spent a lot of time this week trying different approaches to natural language recognition.
It could have gone better.
A recent article on wired.com talked about the exact problem I was having: that Machine Learning (which everyone loves to call Artificial Intelligence) is not really intelligent at all. When you look at the solution in detail, we’re really just brute-force tuning a generalized algorithm until it produces results in line with our expectations.
Technically it is “trainable”, but it’s nowhere near “intelligent”.
I initially thought that my gaming “chatbot” could be just that: a standard chatbot tuned to work with gaming rules. The problem was that I wanted the system to understand the player’s intent, narrate the scenario, and adapt to player improvisation.
Have you ever tried to have a conversation with a chatbot? It usually goes something like this:
> Hello
Hi.
> What do you think of Donald Trump?
It is a Twinkie.
When the input text does not fall within the expected boundaries, the system still makes a guess and acts on whichever response it calculates as the most likely.
Machine Learning solutions are NOT good at improvisation, but that’s a problem for later.
Most chatbots take a string of input, remove the “unimportant” words and create the input for an artificial neural network in the form of a “bag of words”. A bag of words is a collection of words extracted from a sentence. From that data we can tell which words were in the sentence, but not what order they were in.
You might think this would be a big problem, but for most simple sentences it is not. Most grade-schoolers at some point are given a bag of words and asked to form a sentence with them. In most cases, they can correctly reassemble the sentence.
But let’s say we have a scenario in our game like this:
You have a table in front of you with two plates. One red and one blue. There are also two cups. One red and one blue.
Now, let’s assume that to proceed the player must place the red cup on the blue plate.
The player enters: I put the red cup on the blue plate
Easy, right? Not so fast. Our bag of words routine takes that and changes it to: blue cup I plate put red (not necessarily in that order)
Er… which one is blue? The cup or the plate? Since both exist, the system cannot correctly determine what the intent of the player is. So this model won’t work for us.
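To make the collision concrete, here is a minimal Python sketch of a bag of words routine. The stop-word list and the `bag_of_words` function are my own assumptions for illustration, not the code of any particular chatbot framework. It shows that the intended move and its exact opposite end up as the same bag.

```python
from collections import Counter

# Assumption: a tiny hand-picked stop-word list; real chatbots use larger ones.
STOP_WORDS = {"the", "on", "a", "an"}

def bag_of_words(sentence):
    """Count the words in a sentence, ignoring case, stop words, and order."""
    words = sentence.lower().split()
    return Counter(w for w in words if w not in STOP_WORDS)

intended = bag_of_words("I put the red cup on the blue plate")
opposite = bag_of_words("I put the blue cup on the red plate")

print(sorted(intended))      # ['blue', 'cup', 'i', 'plate', 'put', 'red']
print(intended == opposite)  # True
```

Both sentences collapse to the same six words, so nothing downstream of the bag can tell which item was red and which was blue.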
What about translation engines? They keep track of word order (sort of).
There are different approaches to this. Some use Sequence-to-Sequence models (which work very well for translation) while others use some form of convolutional model.
These work really well for converting between languages, but they are still not fit for determining our player’s intent.
So for now, it’s back to the chatbot model, which will attempt to classify our player input into a relatively small group of possible actions and feed us a response.
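As a rough sketch of what “classify into a small group of actions” could look like, the Python below scores the input against a few actions and picks the best one. This is not a neural network: plain keyword overlap stands in for the trained model, and the action names and keyword sets are hypothetical, made up for this example.

```python
# Hypothetical actions and keywords, invented for this sketch.
ACTIONS = {
    "place_item": {"put", "place", "set", "cup", "plate"},
    "look_around": {"look", "examine", "inspect", "table"},
    "greet": {"hello", "hi", "hey"},
}

def classify(text):
    """Return (best action, score), where score is the share of keywords matched."""
    words = set(text.lower().replace("?", "").split())
    scores = {
        action: len(words & keywords) / len(keywords)
        for action, keywords in ACTIONS.items()
    }
    # max() always returns something, even when every score is zero.
    best = max(scores, key=scores.get)
    return best, scores[best]

print(classify("I put the red cup on the blue plate"))  # ('place_item', 0.6)
print(classify("What do you think of Donald Trump?"))   # ('place_item', 0.0)
```

The second call is the Twinkie problem in miniature: no keyword matches, yet the classifier still hands back an action. Rejecting low scores with a threshold would be one way to soften that, at the cost of more “I don’t understand” replies.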