The previous two recipes in this chapter detected tokens (words) and sentences using legacy Java classes and their methods. In this recipe, we will combine the two tasks using Apache OpenNLP, an open-source library. Both tasks can be accomplished well with the legacy methods; the reason for revisiting them with OpenNLP is to introduce data scientists to a handy tool that achieves very high accuracy on several information retrieval tasks over standard, classic corpora. The homepage for OpenNLP is https://opennlp.apache.org/. One strong argument for using this library for tokenization, sentence segmentation, part-of-speech tagging, named entity recognition, chunking, parsing, and co-reference resolution is that you can train your own classifier on your own corpus of articles or documents.
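
As a point of comparison, the two-step pipeline this recipe combines (split text into sentences, then split each sentence into tokens) can be sketched with the JDK alone. This is a minimal baseline sketch, not OpenNLP code: it assumes the "legacy Java classes" mentioned above include `java.text.BreakIterator`, and the class name `LegacySegmenter` and its method names are hypothetical, chosen only for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class: a JDK-only baseline for sentence and token detection.
public class LegacySegmenter {

    // Split text into sentences using the JDK's sentence BreakIterator.
    public static List<String> sentences(String text) {
        List<String> result = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.US);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                result.add(sentence);
            }
        }
        return result;
    }

    // Split one sentence into word tokens; segments that start with
    // whitespace or punctuation are dropped.
    public static List<String> tokens(String sentence) {
        List<String> result = new ArrayList<>();
        BreakIterator it = BreakIterator.getWordInstance(Locale.US);
        it.setText(sentence);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String candidate = sentence.substring(start, end);
            if (Character.isLetterOrDigit(candidate.charAt(0))) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "OpenNLP is an Apache library. It can detect sentences and tokens.";
        // First detect sentences, then tokenize each sentence in turn.
        for (String sentence : sentences(text)) {
            System.out.println(sentence + " -> " + tokens(sentence));
        }
    }
}
```

The sentence-then-token structure is the part that carries over to OpenNLP; the rule-based `BreakIterator` calls are what a trained model would replace.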
Java Data Science Cookbook