A Recurrent Neural Network (RNN) is a powerful model under active investigation in deep learning research. RNNs have already proven useful in sentiment analysis, speech recognition, machine translation and many other tasks. Using my Facebook data (Tutorial), I decided to see if an RNN could learn to model the way I communicate online.
For language modelling, long short-term memory networks (LSTMs) have been very successful in recent work. If you aren’t familiar with LSTMs, I recommend reading this post. For this project, I will be using torch-rnn, an RNN and LSTM module for character-level language modelling written for Torch.
For training data, I selected my historical Facebook messages using a simple criterion: every word in a message must have been used at least 6 times across all of my messages. This cleaning step eliminates messages with typos, URLs and uncommon words. Afterwards, I concatenated all of the remaining messages into a text file, using “.” as a delimiter between messages (this works only because almost none of my messages end in periods). The result is a 2.5 MB text file.
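The cleaning step above can be sketched roughly as follows. This is a minimal illustration, not my actual script; `filter_messages` and `build_corpus` are hypothetical helper names, and the tiny `min_count=2` demo threshold stands in for the real value of 6:

```python
from collections import Counter

def filter_messages(messages, min_count=6):
    """Keep only messages in which every word appears at least
    min_count times across the full corpus (hypothetical helper
    mirroring the cleaning criterion described above)."""
    counts = Counter(word for msg in messages for word in msg.split())
    return [m for m in messages
            if all(counts[w] >= min_count for w in m.split())]

def build_corpus(messages):
    # Join the cleaned messages with "." as the delimiter between them.
    return ".".join(messages)

msgs = ["hey how are you", "hey you", "check http://rare.example once"]
clean = filter_messages(msgs, min_count=2)  # only "hey you" survives
corpus = build_corpus(clean)
```

Because the word counts come from the whole corpus, a rare token like a URL or a typo appears once and drags its entire message out of the training set.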
Using this data, I trained a 2-layer LSTM with 512 hidden units per layer and dropout of 0.5 for 50 epochs. I then sampled sequences 100 characters in length and extracted the complete messages (phrases separated by the delimiter). Here are some of the generated messages:
okay then this is really.
yea im not even gonna look at away.
so you can almost beast.
lol oh wow.
ill eat that much.
okay it seems to be this course online
thats a bad good page.
it actually looks pretty sick.
yo you dont tell him we need to think lol.
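The message-extraction step described above can be sketched like this. It is a hypothetical helper (`extract_messages` is not part of torch-rnn): since a 100-character sample usually starts and ends mid-message, only the interior chunks between delimiters are kept:

```python
def extract_messages(sample, delimiter="."):
    """Split a raw sampled character sequence on the delimiter and
    keep only complete messages: the first and last chunks may be
    truncated mid-message, so drop them."""
    parts = sample.split(delimiter)
    return [p.strip() for p in parts[1:-1] if p.strip()]

raw = "ome text.lol oh wow.ill eat that much.okay it se"
extract_messages(raw)  # keeps only the two complete middle messages
```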
Considering the LSTM is character-level, it has done a great job of learning the spelling of many English words. Furthermore, it has picked up pretty solid grammar (keep in mind that the training data doesn’t contain many complete sentences). That being said, not all of the generated messages make sense, such as “thats a bad good page”. In terms of sounding like me, I would say the LSTM has successfully mimicked some of the phrases and patterns that I use. However, it definitely needs more training data to improve comprehension and fix some grammatical problems.
Overall, it was fun to experiment with my own Facebook data and play with an LSTM. I will probably try out other generative models in the future to see if other cool results can be achieved. Also, the code used for this project will be on my GitHub if I get around to cleaning it up.
Credits to Chris Olah for the LSTM image.