Handling typos with n-grams
Sometimes you want to return search results even though the user made a typo, or perhaps more than one. In Solr, there are multiple ways to do this: use the Spellchecker component to try to correct the user's mistake, use fuzzy queries, or use the n-gram approach. This recipe concentrates on the third approach and shows you how to use n-grams to handle user typos.
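To see why n-grams tolerate typos, it can help to look at the idea outside Solr first. The following is a minimal Python sketch, not Solr's implementation: the function names `char_ngrams` and `shared_ngrams` and the gram sizes of 2 and 3 are assumptions chosen for illustration only. It splits a term into overlapping character fragments and shows that a misspelled term still shares some fragments with the correct one.

```python
def char_ngrams(term, min_gram=2, max_gram=3):
    """Return the set of character n-grams of term, for every length
    from min_gram to max_gram inclusive (sizes are illustrative)."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        # Slide a window of the given size across the term.
        for start in range(len(term) - size + 1):
            grams.add(term[start:start + size])
    return grams


def shared_ngrams(indexed_term, query_term, min_gram=2, max_gram=3):
    """Return the n-grams that the indexed term and the query term have in common."""
    return char_ngrams(indexed_term, min_gram, max_gram) & char_ngrams(
        query_term, min_gram, max_gram
    )


if __name__ == "__main__":
    # "sorl" is a transposition typo of "solr"; the two still overlap.
    print(sorted(char_ngrams("solr")))
    print(sorted(shared_ngrams("solr", "sorl")))
```

Because the index stores fragments rather than whole words, a query containing a typo can still match on the fragments that were typed correctly. Longer terms usually share more fragments with their misspellings than this short example does.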
How to do it...
For this recipe, let's assume that our index is built of four fields: identifier, name, description, and description_ngram, which will be processed with the n-gram filter.
So, let's start with the definition of our index structure, which can look like this (we will place the following entries in the schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="name" type="text_general" indexed="true" stored="true"/...