In the previous chapter, we looked at how SolrCloud is set up and how it works. We saw how distributed indexing and search operate and how they address horizontal scalability and high availability in a large-scale Solr deployment. We also discussed the use of SolrCloud as a large-scale NoSQL database.
In this chapter, we will look at what text tagging is and how Lucene, and therefore Solr, can be used to implement it as part of indexing. We will discuss the Finite State Transducer (FST) and the algorithms related to it, and learn how it can be integrated with Solr. The topics that we will cover are:
An overview of FST and text tagging
Implementation of FST in Lucene
Text tagging algorithms
Using Solr for text tagging
Implementing a text tagger using Solr