Natural Language Processing with Java - Second Edition

By: Richard M. Reese

Overview of this book

Natural Language Processing (NLP) allows you to take any sentence and identify patterns, special names, company names, and more. The second edition of Natural Language Processing with Java teaches you how to perform language analysis with the help of Java libraries, while constantly gaining insights from the outcomes. You’ll start by understanding how NLP and its various concepts work. Having got to grips with the basics, you’ll explore important tools and libraries in Java for NLP, such as CoreNLP, OpenNLP, Neuroph, and Mallet. You’ll then start performing NLP on different inputs and tasks, such as tokenization, model training, part-of-speech tagging, and parse trees. You’ll learn about statistical machine translation, summarization, dialog systems, complex searches, supervised and unsupervised NLP, and more. By the end of this book, you’ll have learned more about NLP, neural networks, and various other trained models in Java for enhancing the performance of NLP applications.

Using NLP APIs


We will use the OpenNLP and Stanford APIs to demonstrate parsing and the extraction of relation information. LingPipe can also be used, but will not be discussed here. An example of how LingPipe is used to parse biomedical literature can be found at http://alias-i.com/lingpipe-3.9.3/demos/tutorial/medline/read-me.html.

Using OpenNLP

Parsing text is straightforward with the ParserTool class. Its static parseLine method accepts three arguments and returns an array of Parse instances. These arguments are as follows:

  • A string containing the text to be parsed
  • A Parser instance
  • An integer specifying how many parses are to be returned

Each Parse instance holds the elements of one parse. The parses are returned in order of their probability. To create a Parser instance, we will use the create method of the ParserFactory class. This method takes a ParserModel instance, which we will create from the en-parser-chunking.bin file.

This process is shown here, in which an input stream for the model file is created using...
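
The steps described in this section can be sketched as follows. This is a minimal illustration rather than the book's own listing: the class name OpenNlpParseSketch, the helper method parse, the sample sentence, and the choice of three parses are assumptions made for the example. It also assumes that the OpenNLP library is on the classpath and that en-parser-chunking.bin is in the working directory.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.cmdline.parser.ParserTool;
import opennlp.tools.parser.Parse;
import opennlp.tools.parser.Parser;
import opennlp.tools.parser.ParserFactory;
import opennlp.tools.parser.ParserModel;

// Hypothetical class name, used only for this sketch.
class OpenNlpParseSketch {

    // Hypothetical helper: loads the model, creates a Parser,
    // and returns up to numParses parses of the sentence.
    static Parse[] parse(String sentence, String modelPath, int numParses)
            throws IOException {
        // try-with-resources closes the model stream once the model is loaded
        try (InputStream modelIn = new FileInputStream(modelPath)) {
            ParserModel model = new ParserModel(modelIn);
            Parser parser = ParserFactory.create(model);
            // Arguments: the text, the Parser instance, the number of parses
            return ParserTool.parseLine(sentence, parser, numParses);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the model file is in the working directory.
        Parse[] parses = parse("The cow jumped over the moon",
                "en-parser-chunking.bin", 3);
        for (Parse parse : parses) {
            parse.show(); // prints the parse tree in bracketed form
            System.out.println("Score: " + parse.getProb());
        }
    }
}
```

Requesting more than one parse is useful when the most probable parse is not the one you want, since the alternatives can then be inspected alongside their scores.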