# Jovian Commit Essentials
# Please retain and execute this cell without modifying the contents for `jovian.commit` to work
!pip install jovian --upgrade -q
import jovian
jovian.set_project('sentiment-rnn-solution')
jovian.set_colab_id('1FV8kREe-S4L68iCM9NSRd1cLXoDZhW8i')

Sentiment Analysis with an RNN

In this notebook, you'll implement a recurrent neural network that performs sentiment analysis.

An RNN is usually more accurate for this task than a strictly feedforward network because it can use information about the order of the words.

Here we'll use a dataset of movie reviews, accompanied by sentiment labels: positive or negative.


Network Architecture

The architecture for this network is shown below.

[Figure: network architecture diagram]

First, we'll pass in words to an embedding layer. We need an embedding layer because we have tens of thousands of words, so we need a more efficient representation for our input data than one-hot encoded vectors. You should have seen this before in the Word2Vec lesson. You could train an embedding with the Skip-gram Word2Vec model and use those embeddings as input here. However, it's good enough to just have an embedding layer and let the network learn its own embedding table. In this case, the embedding layer is for dimensionality reduction rather than for learning semantic representations.
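To make the efficiency argument concrete, here is a minimal NumPy sketch comparing a one-hot multiplication with a direct row lookup. The sizes, the random table, and the word ids are all made up for illustration; in the real network the table would hold trained weights and the vocabulary would be far larger.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
vocab_size = 10      # a real vocabulary would have tens of thousands of words
embedding_dim = 4    # width of each word vector

rng = np.random.default_rng(0)
# The embedding table: one row per word. Random here, trained in a real network.
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

# A short review encoded as integer word ids (made-up values).
word_ids = np.array([3, 7, 1])

# One-hot route: build a (3, vocab_size) matrix and multiply it by the table.
one_hot = np.eye(vocab_size)[word_ids]
via_matmul = one_hot @ embedding_table

# Embedding route: index the rows directly, with no large sparse matrix.
via_lookup = embedding_table[word_ids]

print(via_lookup.shape)                      # (3, 4)
print(np.allclose(via_matmul, via_lookup))   # True
```

Both routes give the same vectors, but the lookup never materializes a matrix whose width is the vocabulary size, which is why an embedding layer scales to large vocabularies.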

After input words are passed to an embedding layer, the new embeddings will be passed to LSTM cells. The LSTM cells will add recurrent connections to the network and give us the ability to include information about the sequence of words in the movie review data.
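As a rough illustration of what the recurrent connections do, the sketch below steps a single LSTM cell over a sequence of embedded words in plain NumPy. It uses the standard LSTM gate equations with random, untrained weights; the helper names (`lstm_step`, `run_lstm`) and all sizes are assumptions made for this example, not the notebook's actual model code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U and b hold the four gates stacked together."""
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b               # shape (4 * hidden,)
    i = sigmoid(z[0 * hidden:1 * hidden])    # input gate
    f = sigmoid(z[1 * hidden:2 * hidden])    # forget gate
    g = np.tanh(z[2 * hidden:3 * hidden])    # candidate cell values
    o = sigmoid(z[3 * hidden:4 * hidden])    # output gate
    c = f * c_prev + i * g                   # new cell state
    h = o * np.tanh(c)                       # new hidden state
    return h, c

def run_lstm(embedded_words, W, U, b, hidden):
    """Feed the words in order, carrying the state from one step to the next."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    outputs = []
    for x in embedded_words:
        h, c = lstm_step(x, h, c, W, U, b)
        outputs.append(h)
    return np.stack(outputs)                 # one hidden state per word

# Hypothetical sizes and random (untrained) weights.
embedding_dim, hidden_dim, seq_len = 4, 3, 5
rng = np.random.default_rng(1)
W = rng.normal(scale=0.5, size=(4 * hidden_dim, embedding_dim))
U = rng.normal(scale=0.5, size=(4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

review = rng.normal(size=(seq_len, embedding_dim))   # stand-in for embedded words
forward = run_lstm(review, W, U, b, hidden_dim)
backward = run_lstm(review[::-1], W, U, b, hidden_dim)

print(forward.shape)   # (5, 3)
# Same words, different order: the final hidden states are expected to differ.
print(np.allclose(forward[-1], backward[-1]))
```

Because each step depends on the state left by the previous one, feeding the same words in a different order produces a different final state, which is the sequence information a feedforward network over a bag of words cannot capture.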

Finally, the LSTM outputs will go to a sigmoid output layer. We're using a sigmoid function because the positive and negative labels are encoded as 1 and 0, respectively, and a sigmoid outputs a predicted sentiment value between 0 and 1.

We don't care about the sigmoid outputs except for the very last one; we can ignore the rest. We'll calculate the loss by comparing the output at the last time step and the training label (pos or neg).
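The last-step loss described above can be sketched as follows. The LSTM outputs, the output-layer weights, and the label are invented numbers, and binary cross-entropy is assumed as the loss because it is the usual choice for a sigmoid output with 0/1 labels; the text itself does not name a specific loss function.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def binary_cross_entropy(prediction, label):
    """Loss for one prediction in (0, 1) against a 0/1 label."""
    eps = 1e-12                               # guards against log(0)
    p = np.clip(prediction, eps, 1 - eps)
    return float(-(label * np.log(p) + (1 - label) * np.log(1 - p)))

# Made-up LSTM outputs for one review: 4 time steps, hidden size 3.
lstm_outputs = np.array([[0.1, -0.2,  0.0],
                         [0.3,  0.1, -0.1],
                         [0.5,  0.2,  0.1],
                         [0.8,  0.4,  0.2]])

# Hypothetical output-layer weights mapping each hidden state to one logit.
w = np.array([1.0, 2.0, -1.0])
bias = 0.0

all_predictions = sigmoid(lstm_outputs @ w + bias)   # one value per time step
last_prediction = all_predictions[-1]                # the only one we keep

label = 1                                            # 1 = positive, 0 = negative
loss = binary_cross_entropy(last_prediction, label)

print(all_predictions.shape)   # (4,)
print(last_prediction, loss)
```

Only `last_prediction` enters the loss; the earlier entries of `all_predictions` are computed here just to show what is being discarded.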


Load in and visualize the data

import numpy as np
import pandas as pd

# names= supplies the column labels, so the first row of the file is read as data
data = pd.read_csv('sample_data/twitter.csv', encoding="ISO-8859-1",
                   names=["target", "ids", "date", "flag", "user", "text"])
data.head()