One of the most common requirements I meet is stemming, the process of reducing words to their root forms (or stems). Let's imagine an e-commerce bookstore, where you store the books' names and descriptions. We want to be able to find words such as shown or showed when we type the word show, and vice versa. To achieve that, we can use stemming algorithms. This recipe will show you how to add stemming to your data analysis.
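To make the idea concrete, here is a minimal sketch in Python of what a stemmer does. It is a deliberately crude suffix stripper, not the algorithm Solr uses; the function name toy_stem and its suffix list are hypothetical and chosen only so that the three words from the example above end up with the same stem.

```python
def toy_stem(word: str) -> str:
    """Reduce a word to a rough stem by stripping one common suffix.

    Illustrative only: real stemmers apply far more careful rules.
    """
    word = word.lower()
    # Hypothetical suffix list, checked in this order.
    for suffix in ("ing", "ed", "n", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Both the indexed text and the query are stemmed the same way,
# so different forms of a word can match each other.
indexed_terms = {toy_stem(w) for w in ["shown", "showed"]}
query_term = toy_stem("show")
print(query_term in indexed_terms)
```

Because the same reduction is applied at index time and at query time, a search for show can match documents that contain shown or showed, which is the behavior we want from the description field.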
We need to start with the index structure. Let's assume that our index consists of three fields (add this to the field definition section of your schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="name" type="text" indexed="true" stored="true" />
<field name="description" type="text_stem" indexed="true" stored="true" />
Now let's define our text_stem type, which should look like the following code:
<fieldType name="text_stem" class="solr.TextField"> <...