Solr Cookbook - Third Edition

By: Rafal Kuc

Stemming different languages


Stemming is a very common requirement: it is the process of reducing words to their root forms (stems). Imagine an e-commerce bookstore where you store the names and descriptions of books. You want a search for the word show to also find words such as shown and showed, and vice versa. Stemming algorithms make this possible. The catch is that there is no general-purpose stemmer; stemmers are language-specific. This recipe shows you how to add stemming to your data analysis chain and where to look for a list of stemmers.
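
As a rough preview of what adding stemming to an analysis chain involves, a stemming-enabled field type might look like the following sketch. The type name text_stem is the one used by the description field in this recipe, but the tokenizer and filters chosen here (whitespace tokenization, lowercasing, and the Snowball-based English stemmer) are assumptions made for illustration, not necessarily the exact configuration this recipe uses:

```xml
<!-- Illustrative sketch only: the analyzer chain below is an assumption -->
<fieldType name="text_stem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the input into tokens on whitespace -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Lowercase tokens so that matching is case-insensitive -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reduce English words to their stems with the Snowball stemmer -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
```

Because the sketch declares a single analyzer element without a type attribute, the same chain would run at both index time and query time, so stemmed query terms are compared against stemmed indexed terms.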

How to do it...

To stem English text, we need to take the following steps:

  1. We will start with the index structure. Let's assume that our index consists of the following three fields, defined in the schema.xml file:

    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="description" type="text_stem" indexed...