We will work on the problem of text summarization to create relevant summaries for product reviews about fine food sold on the world's largest e-commerce platform, Amazon. Each review includes product and user information, a rating, and a plain-text review. The dataset also includes reviews from all other Amazon categories. We develop a basic character-level sequence-to-sequence (seq2seq) model by defining an encoder-decoder recurrent neural network (RNN) architecture.
Our dataset includes the following:
- 568,454 reviews
- 256,059 users
- 74,258 products
Note
The dataset used in this recipe can be found at https://www.kaggle.com/snap/amazon-fine-food-reviews/.
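Before any modeling, the review texts and their summaries need to be loaded into a DataFrame. The following is a minimal sketch using pandas; the file name `Reviews.csv` and the column names `Text`, `Summary`, and `Score` match the Kaggle download, but here we build a tiny stand-in DataFrame with the same columns so the snippet runs without the file.

```python
import pandas as pd

# Stand-in for pd.read_csv("Reviews.csv") -- same columns as the Kaggle CSV
reviews = pd.DataFrame({
    "Text": ["Great coffee, rich flavor and a smooth finish.",
             "The crackers arrived stale and broken."],
    "Summary": ["Great coffee", "Stale crackers"],
    "Score": [5, 1],
})

# In the recipe itself you would load the real file instead:
# reviews = pd.read_csv("Reviews.csv")

# Keep only the fields needed for summarization and drop empty rows
pairs = reviews[["Text", "Summary"]].dropna()
print(len(pairs))
```

Working from a (review text, summary) pair per row is what the seq2seq model expects: the review is the source sequence and the summary is the target sequence.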
In this recipe, we develop a modeling pipeline with an encoder-decoder architecture that tries to create relevant summaries for a given set of reviews. The pipeline uses RNN models written with the Keras functional API, along with various data manipulation libraries.
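The encoder-decoder wiring can be sketched with the Keras functional API as follows. This is a minimal character-level seq2seq skeleton, not the recipe's exact model: the vocabulary size `num_chars` and hidden size `latent_dim` are placeholder values, and the layer choices (a single LSTM on each side) are assumptions.

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

num_chars = 40   # size of the character vocabulary (placeholder)
latent_dim = 64  # LSTM hidden size (placeholder)

# Encoder: reads the review one character at a time;
# only its final hidden and cell states are kept
encoder_inputs = Input(shape=(None, num_chars))
_, state_h, state_c = LSTM(latent_dim, return_state=True)(encoder_inputs)

# Decoder: generates the summary, initialized with the encoder's final states
decoder_inputs = Input(shape=(None, num_chars))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs,
                                     initial_state=[state_h, state_c])
# Predict a distribution over characters at each decoder step
decoder_outputs = Dense(num_chars, activation="softmax")(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")
```

During training the decoder input is the summary shifted by one character (teacher forcing); at inference time the decoder is run step by step, feeding each predicted character back in.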
The encoder-decoder architecture...