Imagine that you want to filter out words that are considered vulgar from the data you are indexing. Such words can end up in your data by accident, and because you don't want them to be searchable, you want Solr to ignore them. Can we do that with Solr? Of course we can, and this recipe will show you how.
Let's start with the following index structure (just add this to the field section of your schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="name" type="text_ignored" indexed="true" stored="true" />
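To make the two fields more concrete, here is a rough sketch of a document that could be sent to Solr in its XML update format. The identifier and the name value are made up for illustration, and `badword` is only a hypothetical stand-in for a word you would want ignored; it is not taken from the recipe's data.

```xml
<!-- Hypothetical sample document; the values are illustrative only -->
<add>
  <doc>
    <!-- Unique identifier, stored as a plain string -->
    <field name="id">1</field>
    <!-- Analyzed with the text_ignored type defined in the next step -->
    <field name="name">A sample name containing badword</field>
  </doc>
</add>
```

The name field is the one that matters here, because its type decides which tokens reach the index.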
The second step is to define the text_ignored type, which looks like the following code:
<fieldType name="text_ignored" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="ignored.txt" enablePositionIncrements...