As you probably already know, Solr supports UTF-8 encoding and can therefore handle data in many languages. However, if you have ever needed to sort text in a language that has its own special characters, you probably know that sorting does not work well on the standard Solr string type. This recipe will show you how to deal with language-aware sorting in Solr.
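To see why a plain string sort goes wrong, here is a minimal Java sketch, independent of Solr, that compares ordering by raw character value with ordering by a locale-aware `java.text.Collator`. The class name `PolishSortDemo`, its two helper methods, and the sample words are hypothetical and chosen only for illustration. Comparing Java strings directly only approximates how a Solr string field is ordered.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class: not part of Solr, only illustrates the sorting problem.
public class PolishSortDemo {

    // Sorts by raw character value (String.compareTo), which approximates
    // the ordering you get when sorting on a plain string field.
    static List<String> sortByCodePoint(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    // Sorts using the collation rules of the given locale.
    static List<String> sortWithCollator(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        Collator collator = Collator.getInstance(locale);
        copy.sort(collator);
        return copy;
    }

    public static void main(String[] args) {
        // "\u0141\u00f3d\u017a" is the Polish city name Lodz written with
        // its diacritics; escapes keep the source file ASCII-only.
        List<String> words =
            Arrays.asList("Zebra", "\u0141\u00f3d\u017a", "Lublin", "Mazury");

        System.out.println("Raw order:      " + sortByCodePoint(words));
        System.out.println("Collated order: "
            + sortWithCollator(words, Locale.forLanguageTag("pl-PL")));
    }
}
```

In the raw ordering, the word starting with the Polish letter Ł ends up after "Zebra", because its character value is higher than that of any unaccented Latin letter. With the Polish collator, it should land between "Lublin" and "Mazury", which is what a Polish reader expects.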
These steps tell us how to sort text in non-English languages properly:
For the purpose of this recipe, I have assumed that we will have to sort text that contains Polish characters. To show the good and bad sorting behaviour, we need to create the following index structure (add this to your schema.xml file):
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="name" type="text" indexed="true" stored="true" />
<field name="name_sort_bad" type="string" indexed="true" stored="true" />
<field name="name_sort_good" type="text_sort" indexed="true" stored="true" />
Now...