Lucene 4 Cookbook

By: Edwood Ng, Vineeth Mohan

Obtaining TokenAttribute values

With a TokenStream, we can look at how token values are retrieved. At a high level, a TokenStream is an enumeration of tokens. To access token values, we register one or more attribute objects with the TokenStream. Note that only one instance exists per attribute type. This is for performance reasons: rather than creating new objects in each iteration, the same attribute instances are updated every time we advance to the next token.
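To make the reuse pattern concrete, here is a minimal sketch in plain Java. It does not use Lucene: ToyTokenStream, ToyTermAttribute, and AttributeReuseDemo are hypothetical stand-ins written only for illustration, and the whitespace splitting is an assumption, not how any Lucene tokenizer works. In Lucene itself, the corresponding calls on TokenStream are addAttribute(CharTermAttribute.class) to obtain the attribute and incrementToken() to advance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a term attribute: one mutable holder,
// reused for every token in the stream.
final class ToyTermAttribute {
    private String term = "";

    void set(String value) { term = value; }

    @Override
    public String toString() { return term; }
}

// Hypothetical stand-in for a token stream over whitespace-separated text.
final class ToyTokenStream {
    private final String[] words;
    private int next = 0;
    private final ToyTermAttribute termAttr = new ToyTermAttribute();

    ToyTokenStream(String text) {
        String trimmed = text.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Always hands back the same instance, however often it is called.
    ToyTermAttribute termAttribute() { return termAttr; }

    // Advances to the next token by overwriting the shared attribute in place.
    boolean incrementToken() {
        if (next >= words.length) {
            return false;
        }
        termAttr.set(words[next++]);
        return true;
    }
}

final class AttributeReuseDemo {
    // Collects every term by copying the attribute's current value to a String.
    static List<String> collectTerms(String text) {
        ToyTokenStream stream = new ToyTokenStream(text);
        ToyTermAttribute term = stream.termAttribute(); // obtained once, before the loop
        List<String> terms = new ArrayList<>();
        while (stream.incrementToken()) {
            terms.add(term.toString()); // copy now; the next increment overwrites it
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("Lucene is mainly used"));
    }
}
```

The point to notice is that the attribute reference is fetched once, outside the loop, and its value is copied out on each iteration. Holding on to the attribute object itself instead of its value would leave us with only the last token's text.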

Getting ready

There are several types of attributes; each type exposes a different aspect, or piece of metadata, of a token. Here are the token attribute interfaces we will review in this section:

  • CharTermAttribute: This exposes a token's actual textual value, equivalent to a term's value.

  • PositionIncrementAttribute: This returns the position of the current token relative to the previous token. This attribute is useful in phrase-matching as the keyword order and their positions are important. If there...