Natural Language Processing with Java and LingPipe Cookbook

Combining feature extractors


Feature extractors can be combined in much the same way as tokenizers in Chapter 2, Finding and Working with Words.
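As a rough illustration of what combining means, the following plain-Java sketch merges the feature maps produced by two extractors, summing the values of any features that share a name. It does not use LingPipe: the `combine` helper, the `wordCounts` and `length` toy extractors, and the `LENGTH` feature name are all hypothetical, and summing is only an assumed merge rule.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class CombineSketch {

    // Hypothetical helper, not a LingPipe API: runs both extractors on the
    // same input and sums the values of features that share a name.
    static Function<CharSequence, Map<String, Double>> combine(
            Function<CharSequence, Map<String, Double>> first,
            Function<CharSequence, Map<String, Double>> second) {
        return text -> {
            Map<String, Double> merged = new HashMap<>(first.apply(text));
            second.apply(text)
                  .forEach((name, value) -> merged.merge(name, value, Double::sum));
            return merged;
        };
    }

    // Toy extractor: one feature per whitespace-separated word, valued by its count.
    static Map<String, Double> wordCounts(CharSequence text) {
        Map<String, Double> features = new HashMap<>();
        for (String word : text.toString().trim().split("\\s+")) {
            if (!word.isEmpty()) {
                features.merge(word, 1.0, Double::sum);
            }
        }
        return features;
    }

    // Toy extractor: a single feature holding the input length in characters.
    static Map<String, Double> length(CharSequence text) {
        Map<String, Double> features = new HashMap<>();
        features.put("LENGTH", (double) text.length());
        return features;
    }

    public static void main(String[] args) {
        Function<CharSequence, Map<String, Double>> combined =
                combine(CombineSketch::wordCounts, CombineSketch::length);
        // Prints the word-count features together with the LENGTH feature.
        System.out.println(combined.apply("to be or not to be"));
    }
}
```

Because the combined extractor has the same shape as its parts, it can itself be passed to `combine` again to stack a third extractor on top.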

How to do it…

This recipe will show you how to combine the feature extractor from the previous recipe with a very common feature extractor over character n-grams.
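To make the idea of features over character n-grams concrete, here is a small plain-Java sketch that counts every character substring within a length range. It is not LingPipe code: the `charNGramCounts` helper and its `min` and `max` parameters are hypothetical names chosen for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class CharNGramSketch {

    // Hypothetical helper, not a LingPipe API: counts every character
    // substring whose length is between min and max, inclusive.
    static Map<String, Integer> charNGramCounts(CharSequence text, int min, int max) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int n = min; n <= max; n++) {
            for (int start = 0; start + n <= text.length(); start++) {
                String gram = text.subSequence(start, start + n).toString();
                counts.merge(gram, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Each n-gram acts as a feature name; its count is the feature value.
        System.out.println(charNGramCounts("abab", 2, 3));
    }
}
```

Inputs shorter than `min` simply yield an empty map, so very short strings contribute no n-gram features in this sketch.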

  1. We will start with a main() method in src/com/lingpipe/cookbook/chapter3/CombinedFeatureExtractor.java that we will use to run the feature extractor. The following lines set up features that result from the tokenizer using the LingPipe class, TokenFeatureExtractor:

    public static void main(String[] args) {
        // Tokens are character n-grams of length 2 through 4.
        int min = 2;
        int max = 4;
        TokenizerFactory tokenizerFactory
            = new NGramTokenizerFactory(min, max);
        // Each n-gram token becomes a feature whose value is its count.
        FeatureExtractor<CharSequence> tokenFeatures
            = new TokenFeatureExtractor(tokenizerFactory);

  2. Then, we will construct the feature extractor from the previous recipe.

    FeatureExtractor<CharSequence> numberFeatures
        = new ContainsNumberFeatureExtractor();

  3. Next, the LingPipe...