Imagine that you want to filter out words considered vulgar from the data you are indexing. Such words can end up in your data by accident, and you don't want them to be searchable, so you want Solr to ignore them. Can we do that with Solr? Of course we can, and this recipe will show you how.
Let's start with the index structure (just add this to the fields section of your schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="name" type="text_ignored" indexed="true" stored="true" />
The text_ignored type definition looks like this:
<fieldType name="text_ignored" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="ignored.txt" enablePositionIncrements="true" />
  </analyzer>
</fieldType>
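To get a feel for what this analyzer chain does to a field value, here is a minimal Python sketch that imitates it outside of Solr. It is only an illustration of the behavior, not Solr's implementation: the analyze function, the sample text, and the word list are made up for this example.

```python
def analyze(text, ignored_words):
    """Imitate a whitespace tokenizer followed by a case-insensitive stop filter."""
    # ignoreCase="true": compare tokens against the list without regard to case.
    ignored = {word.lower() for word in ignored_words}
    # Split on whitespace only, as a whitespace tokenizer would.
    tokens = text.split()
    # Drop tokens found in the ignore list; surviving tokens keep their original case.
    return [token for token in tokens if token.lower() not in ignored]


# Hypothetical ignore list, standing in for the lines of ignored.txt.
ignored_words = ["vulgar"]

print(analyze("Company Vulgar product", ignored_words))
```

Because the text is split on whitespace only, a token with attached punctuation (for example a word followed directly by a comma) does not match its entry in the list and is kept.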
The...