Before training the model on the pre-processed data, let's understand the model definition for this problem. In the code, we define a model class in the model.py
file. The class contains four major components, as follows:
- Input: We define the TensorFlow placeholders in the model for both input (X) and target (Y).
- Network definition: There are four components of the network for this model. They are as follows:
    - Initializing the LSTM cell: To do this, we begin by stacking two layers of LSTMs together. We then set the size of each LSTM to the RNN_SIZE parameter defined in the code. The RNN is then initialized with a zero state.
    - Word embeddings: We encode the words in the text using word embeddings rather than one-hot encoding. This is done mainly to reduce the dimensionality of the training set, which helps the neural network learn faster. We generate embeddings from a uniform distribution for each word in the vocabulary and look them up with TensorFlow's embedding_lookup function.
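To make the embedding step concrete, here is a minimal NumPy sketch of what an embedding lookup does: the embedding matrix is initialized from a uniform distribution, and looking up word IDs is just row indexing into that matrix. This is an illustration of the idea, not the book's model.py code, and the vocabulary size and embedding dimension below are hypothetical:

```python
import numpy as np

np.random.seed(0)

VOCAB_SIZE = 10_000   # hypothetical vocabulary size
EMBED_DIM = 128       # hypothetical embedding dimension

# One embedding vector per word, drawn from a uniform distribution,
# mirroring how the embeddings are generated for each word in the vocabulary.
embedding = np.random.uniform(-1.0, 1.0, size=(VOCAB_SIZE, EMBED_DIM))

# A batch of word IDs, e.g. one sequence of five tokens.
word_ids = np.array([42, 7, 42, 1999, 3])

# TensorFlow's embedding_lookup is essentially this row indexing:
looked_up = embedding[word_ids]

print(looked_up.shape)
```

Note the dimension reduction: each token is represented by EMBED_DIM values instead of a VOCAB_SIZE-long one-hot vector, and repeated word IDs (42 above) map to the same embedding row.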