Natural Language Processing with Java and LingPipe Cookbook

Marking embedded chunks in a string – sentence chunk example


The method of displaying chunkings in the previous recipes is not well suited for applications that need to modify the underlying string. For example, a sentiment analyzer might want to highlight only the strongly positive sentences and leave the remaining sentences unmarked, while still displaying the entire text. The slight complication in producing the marked-up text is that adding markup changes the underlying string: every insertion shifts the character offsets of the text that follows it, so chunk offsets computed on the original string become stale. This recipe provides working code that inserts the markup in reverse order, from the last chunk to the first, so that the offsets of the chunks still to be processed remain valid.
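The reverse-insertion idea can be sketched in plain Java without any LingPipe dependency. This is only an illustration, not the recipe's code: the `ReverseMarkup` class, its `Span` type (a stand-in for a chunk's start and end offsets), and the `markUp` method are hypothetical names, and the sketch assumes the spans do not overlap, which holds for sentence chunks.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Sketch: wrap character spans in markup by inserting from the end of the text backwards. */
final class ReverseMarkup {

    /** Hypothetical stand-in for a chunk: [start, end) character offsets into the original text. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Wraps each span in open/close markup. Assumes the spans do not overlap. */
    static String markUp(String text, List<Span> spans, String open, String close) {
        List<Span> sorted = new ArrayList<>(spans);
        // Latest start first: each insertion then lands to the right of every
        // span that has not been processed yet, so their offsets stay valid.
        sorted.sort(Comparator.comparingInt((Span s) -> s.start).reversed());

        StringBuilder sb = new StringBuilder(text);
        for (Span s : sorted) {
            // Insert the closing markup first; it sits to the right of the
            // start offset, so the start offset is still correct afterwards.
            sb.insert(s.end, close);
            sb.insert(s.start, open);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "I loved it. It was fine. Best meal ever!";
        List<Span> spans = List.of(new Span(0, 11), new Span(25, 40));
        System.out.println(markUp(text, spans, "<b>", "</b>"));
    }
}
```

Running `main` prints `<b>I loved it.</b> It was fine. <b>Best meal ever!</b>`. Had the spans been processed front to back, the first insertion would have shifted the text by the length of the added markup and the second span's offsets would have pointed at the wrong characters.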

How to do it...

While this recipe may not be technically complex, it is useful for getting span annotations into a text without having to invent the code from whole cloth. The src/com/lingpipe/cookbook/chapter5/WriteSentDetectedChunks class has the referenced code:

  1. The sentence chunking is created as per the first sentence-detection recipe. The following code extracts the chunks as Set<Chunk> and then sorts them by Chunk.LONGEST_MATCH_ORDER_COMPARATOR...