Apache Solr Search Patterns

By: Jayant Kumar

Text tagging algorithms


The process of text tagging can be explained by the following figure:

A document is tokenized, and the tokens are passed to the naive tagger. The naive tagger applies a tagging algorithm to find the tags. The geo-coordinate finder then identifies the geo-locations (latitude-longitude coordinates) corresponding to those tags. The tags and their coordinates are then available as the output.
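The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: the function names (`tokenize`, `naive_tagger`, `geo_coordinate_finder`, `tag_document`) and the tiny `GAZETTEER` lookup table with its approximate coordinates are hypothetical, and the tagging step here is a plain exact lookup standing in for whichever tagging algorithm is actually used.

```python
import re

# Hypothetical gazetteer: tag -> (latitude, longitude).
# The entries and coordinates are illustrative assumptions only.
GAZETTEER = {
    "delhi": (28.61, 77.21),
    "mumbai": (19.08, 72.88),
}


def tokenize(document):
    """Split a document into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", document.lower())


def naive_tagger(tokens, known_tags):
    """Return tokens that are known tags, in order of first appearance."""
    seen = set()
    tags = []
    for token in tokens:
        if token in known_tags and token not in seen:
            seen.add(token)
            tags.append(token)
    return tags


def geo_coordinate_finder(tags, gazetteer):
    """Map each tag to its (latitude, longitude) pair, if one is known."""
    return {tag: gazetteer[tag] for tag in tags if tag in gazetteer}


def tag_document(document):
    """Run the three stages: tokenize, tag, then look up coordinates."""
    tokens = tokenize(document)
    tags = naive_tagger(tokens, GAZETTEER.keys())
    return geo_coordinate_finder(tags, GAZETTEER)


if __name__ == "__main__":
    print(tag_document("Flights from Delhi to Mumbai, then back to Delhi."))
```

Keeping each stage as a separate function mirrors the description: the tagging algorithm can be swapped without touching tokenization or the coordinate lookup.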

There are various text tagging algorithms, each of which has its own benefits. Let us go through some of the algorithms that can be used for text tagging.

Fuzzy string matching algorithm

The fuzzy string matching algorithm can be used to match two strings either exactly or partially. The relationship is called fuzzy because, given a set of n elements and another set of m elements, the two sets may match only partially rather than element for element. Using this algorithm, we can identify strings that are similar to a given set of strings. It is like drawing out the similar terms from a string.
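As a rough illustration of partial matching, the sketch below uses Python's standard `difflib` module. This is an assumption made for demonstration: `difflib` scores similarity from matching subsequences, which may differ from the measure the book goes on to use. The helper names `similarity` and `find_similar`, the sample words, and the `0.6` cutoff are all hypothetical choices.

```python
import difflib


def similarity(a, b):
    """Return a score in [0, 1]; 1.0 means the two strings are identical."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def find_similar(term, candidates, cutoff=0.6):
    """Return the candidates whose similarity to term is at least cutoff,
    best match first."""
    return difflib.get_close_matches(term, candidates, n=len(candidates), cutoff=cutoff)


if __name__ == "__main__":
    # An exact match scores 1.0; a partial match scores somewhere below that.
    print(similarity("solr", "solr"))
    print(similarity("solr", "solar"))

    # A misspelled term still finds its closest known tag.
    print(find_similar("serch", ["search", "solr", "index"]))
```

Raising the cutoff makes matching stricter and returns fewer candidates, while lowering it admits weaker partial matches.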

Suppose we want to find the similarity between two words, say jumps...