RNNs and LSTM Networks
Are you interested in creating a chat bot or doing language processing with Deep Learning? This tutorial will show you one of Caffe2’s example Python scripts that you can run out of the box and modify to start your project from a working Recurrent Neural Network (RNN). This particular RNN is a Long Short-Term Memory (LSTM) network, which is capable of learning and maintaining a memory over time while showing gradual improvement. For more information, you might want to check out Andrej Karpathy’s blog post, The Unreasonable Effectiveness of Recurrent Neural Networks, and his char-rnn project, the inspiration for this Caffe2 variation of the code he posted. It goes into further technical detail with some great animations and examples.
What’s fascinating about this script is that you can throw a variety of data sources at it, like the works of Shakespeare, and not only will it “learn” English, grammar, and spelling, it will also pick up on the nuances of structure and prose used in his works. Likewise, you can feed it speeches from Obama, and while the output might sound like typical political rhetoric from a 3 AM C-SPAN filibuster, you’ll spot a familiar cadence and rhythm that you can almost hear in the nearly intelligible words.
Fed source code, it will learn that language’s structure and mimic code that, while it may not compile, at first glance seems reasonable.
First you’ll want to grab a plain-text copy of Shakespeare’s works, saved as shakespeare.txt, as your training data. Save this right in the directory where you’ll run the script.
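If you don’t already have a copy, one commonly used source is the tinyshakespeare dataset from Karpathy’s char-rnn repository (the URL below is an assumption based on that repo’s layout; any plain-text Shakespeare file will do):

```bash
# Download the tinyshakespeare dataset and save it locally as shakespeare.txt
wget https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt -O shakespeare.txt
```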
Then run the script, passing in the shakespeare.txt file as the argument for --train_data, and sit back for a bit.
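Assuming the example script in question is char_rnn.py from Caffe2’s caffe2/python/examples directory (the path is an assumption; adjust it to wherever your copy lives), the invocation looks like this:

```bash
# Train the character-level LSTM on the Shakespeare text
python char_rnn.py --train_data shakespeare.txt
```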
The output will initially be mostly gibberish, since the network starts with no knowledge of the text; as training proceeds, the samples it prints gradually start to resemble the training data.
If you have a GPU on your computer, then try out the setting for that by adding the --gpu flag.
You may also want to adjust some of the other optional parameters to improve performance or the effect of the training (see the example invocation after this list).
- --seq_length: the number of characters in one continuous sequence grabbed from the training data and passed into the network; defaults to 25
- --batch_size: the number of sequences processed in each training step; defaults to 1
- --hidden_size: the size of the neural network’s hidden layer; defaults to 100
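As a sketch, here is what a run with larger values for these parameters might look like, with GPU training enabled (the specific numbers are arbitrary illustrative choices, not recommendations from the script’s authors):

```bash
# Longer sequences, bigger batches, and a wider hidden layer, trained on the GPU
python char_rnn.py --train_data shakespeare.txt --seq_length 50 --batch_size 32 --hidden_size 256 --gpu
```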
Even after just a few minutes you can see changes in the output that are getting closer to Shakespeare, or at least to English, with a sort of screenplay structure. By the 184,000th iteration of the network, the loss has dropped from around 50% when we first started down to 35% quite quickly.
When acquiring data sources for this script, try to get plain ASCII text rather than UTF-8 or other encodings if you want it to run out of the box. Otherwise, you’re going to need to preprocess the data or modify the script to handle the different formats.
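If your source text is UTF-8, a minimal preprocessing sketch like the one below can strip it down to ASCII before training (the filenames are placeholders):

```python
# Read a UTF-8 text file and drop any characters that can't be represented in ASCII
with open("input_utf8.txt", "rb") as f:
    text = f.read().decode("utf-8", errors="ignore")

ascii_text = text.encode("ascii", errors="ignore").decode("ascii")

with open("shakespeare.txt", "w") as f:
    f.write(ascii_text)
```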