Book Image

Advanced Natural Language Processing with TensorFlow 2

By : Ashish Bansal, Tony Mullen
Book Image

Advanced Natural Language Processing with TensorFlow 2

By: Ashish Bansal, Tony Mullen

Overview of this book

Recently, there have been tremendous advances in NLP, and we are now moving from research labs into practical applications. This book comes with a perfect blend of both the theoretical and practical aspects of trending and complex NLP techniques. The book is focused on innovative applications in the field of NLP, language generation, and dialogue systems. It helps you apply the concepts of pre-processing text using techniques such as tokenization, parts of speech tagging, and lemmatization using popular libraries such as Stanford NLP and SpaCy. You will build Named Entity Recognition (NER) from scratch using Conditional Random Fields and Viterbi Decoding on top of RNNs. The book covers key emerging areas such as generating text for use in sentence completion and text summarization, bridging images and text by generating captions for images, and managing dialogue aspects of chatbots. You will learn how to apply transfer learning and fine-tuning using TensorFlow 2. Further, it covers practical techniques that can simplify the labelling of textual data. The book also has a working code that is adaptable to your use cases for each tech piece. By the end of the book, you will have an advanced knowledge of the tools, techniques and deep learning architecture used to solve complex NLP problems.
Table of Contents (13 chapters)
11
Other Books You May Enjoy
12
Index

To get the most out of this book

  • It would be a good idea to get a background on the basics of deep learning models and TensorFlow.
  • The use of a GPU is highly recommended. Some of the models, especially in the later chapters, are pretty big and complex. They may take hours or days to fully train on CPUs. RNNs are very slow to train without the use of GPUs. You can get access to free GPUs on Google Colab, and instructions for doing so are provided in the first chapter.

Download the example code files

The code bundle for the book is hosted on GitHub at https://github.com/PacktPublishing/Advanced-Natural-Language-Processing-with-TensorFlow-2. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Download the color images

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781800200937_ColorImages.pdf.

Conventions used

There are a number of text conventions used throughout this book.

CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. For example: "In the num_capitals() function, substitutions are performed for the capital letters in English."

A block of code is set as follows:

en = snlp.Pipeline(lang='en')
def word_counts(x, pipeline=en):
  doc = pipeline(x)
  count = sum([len(sentence.tokens) for sentence in doc.sentences])
  return count

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

en = snlp.Pipeline(lang='en')
def word_counts(x, pipeline=en):
  doc = pipeline(x)
  count = sum([len(sentence.tokens) for sentence in doc.sentences])
  return count

Any command-line input or output is written as follows:

!pip install gensim

Bold: Indicates a new term, an important word, or words that you see on the screen, for example, in menus or dialog boxes, also appear in the text like this. For example: "Select System info from the Administration panel."

Warnings or important notes appear like this.

Tips and tricks appear like this.