Imagine you have a powerful preprocessing tool that extracts information about every word in a text. Your boss would like you to use it with Solr, or at least to store the information it returns in Solr. So what can you do? You can store that data in a payload. This recipe shows you how to do it.
I assume that we already have an application that takes care of recognizing the part of speech in our text data. What we need to do is add that data to the Solr index. To do that, we will use a payload: a piece of metadata that can be stored with each occurrence of a term.
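To make the idea concrete, here is a minimal sketch of how the output of such an application might be turned into text that carries one payload per token. It is an illustration under assumptions, not part of the recipe: the function name `to_payload_text`, the `(word, tag)` input shape, and the tag names are hypothetical, and the `|` separator is assumed because it is the default delimiter of Solr's delimited payload filter. How the tag is actually encoded in the index depends on the encoder configured for the field.

```python
def to_payload_text(tagged_words, delimiter="|"):
    """Join (word, tag) pairs into 'word|tag word|tag ...'.

    tagged_words: iterable of (word, tag) pairs, as a hypothetical
    part-of-speech tool might return them.
    delimiter: character separating a token from its payload (assumed "|").
    """
    parts = []
    for word, tag in tagged_words:
        combined = word + tag
        # A whitespace tokenizer splits on spaces, and the payload filter
        # splits on the delimiter, so neither may appear inside a value.
        if delimiter in combined or any(ch.isspace() for ch in combined):
            raise ValueError(f"unsupported character in {word!r} / {tag!r}")
        parts.append(f"{word}{delimiter}{tag}")
    return " ".join(parts)


# Hypothetical tagger output for a two-word text.
tagged = [("ugly", "adjective"), ("human", "noun")]
print(to_payload_text(tagged))  # prints: ugly|adjective human|noun
```

The resulting string is what would be sent to Solr as the value of the payload-enabled field, so that each term occurrence is indexed together with its tag.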
First of all, you need to modify the index structure. To do this, we will add the new field type to the schema.xml file (the following entries should be added to the types section):
<fieldtype name="partofspeech" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.DelimitedPayloadTokenFilterFactory...