Apache Solr High Performance

By: Surendra Mohan

Ignore the defined words from being searched


Imagine a situation where you want to filter offensive words out of the indexed data. Such words must be ignored and must not be searchable. Can Solr provide this capability? Yes, it can, and this section shows how.

To avoid using offensive words in the demonstration, we will use the term offensive as a placeholder for any offensive word we want to exclude from search.

To start, we define the following index structure in the fields section of our schema.xml file:

<field name="o_id" type="string" indexed="true" stored="true" required="true" />
<field name="o_name" type="text_offensive" indexed="true" stored="true" />
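
To make these two fields concrete, a document for this index could be posted in Solr's XML update format, roughly as sketched below. The field values are made-up placeholders chosen for illustration, and the word offensive stands in for a real offensive term, as described earlier.

```xml
<add>
  <doc>
    <!-- o_id: unique identifier, kept as an exact, unanalyzed string -->
    <field name="o_id">1</field>
    <!-- o_name: free text, processed by the text_offensive field type -->
    <field name="o_name">A sample name containing the word offensive</field>
  </doc>
</add>
```

The intent is that, once the text_offensive field type filters out the placeholder term, a query for offensive on o_name should not match this document, while the other words in the name stay searchable.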

Now, let us define the text_offensive field type in the types section of our schema.xml file as follows:

<fieldType name="text_offensive" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class...