The Deep Learning Workshop

By: Mirza Rahim Baig, Thomas V. Joseph, Nipun Sadvilkar, Mohan Kumar Silaparasetty, Anthony So

Overview of this book

Are you fascinated by how deep learning powers intelligent applications such as self-driving cars, virtual assistants, facial recognition devices, and chatbots to process data and solve complex problems? Whether you are familiar with machine learning or are new to this domain, The Deep Learning Workshop will make it easy for you to understand deep learning with the help of interesting examples and exercises throughout. The book starts by highlighting the relationship between deep learning, machine learning, and artificial intelligence and helps you get comfortable with the TensorFlow 2.0 programming structure using hands-on exercises. You’ll understand neural networks, the structure of a perceptron, and how to use TensorFlow to create and train models. The book will then let you explore the fundamentals of computer vision by performing image recognition exercises with convolutional neural networks (CNNs) using Keras. As you advance, you’ll be able to make your model more powerful by implementing text embedding and sequencing the data using popular deep learning solutions. Finally, you’ll get to grips with bidirectional recurrent neural networks (RNNs) and build generative adversarial networks (GANs) for image synthesis. By the end of this deep learning book, you’ll have learned the skills essential for building deep learning models with TensorFlow and Keras.
Table of Contents (9 chapters)
Preface

4. Deep Learning for Text – Embeddings

Activity 4.01: Text Preprocessing of the 'Alice in Wonderland' Text

Solution

You need to perform the following steps:

Note

Before commencing this activity, make sure you have defined the alice_raw variable as demonstrated in the section titled Downloading Text Corpora Using NLTK.

  1. Change the data to lowercase and separate into sentences:
    txt_sents = tokenize.sent_tokenize(alice_raw.lower())
  2. Tokenize the sentences:
    txt_words = [tokenize.word_tokenize(sent) for sent in txt_sents]
  3. Import punctuation from the string module and stopwords from NLTK:
    from string import punctuation
    stop_punct = list(punctuation)
    from nltk.corpus import stopwords
    stop_nltk = stopwords.words("english")
  4. Create a variable holding the contextual stop words "--" and "said":
    stop_context = ["--", "said"]
  5. Create a master list of the stop words to remove, combining the punctuation characters, the NLTK stop words, and the contextual stop words.