Apache Solr Search Patterns

By: Jayant Kumar

Handling unclean data


What do we mean by unclean data? In the last section, we discussed a customer searching for pink sweater, where pink is the color and sweater is the type of clothing. The search engine, however, cannot interpret the input in this fashion on its own. Therefore, in our e-commerce schema design earlier, we created a query that searched across all the fields available in the index. We then added a separate copyField to handle search across fields, such as clothes_color, that are not searched by the default query.
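As a reminder of what such a multi-field query looks like, here is a minimal sketch in Python that builds the request parameters for Solr's edismax query parser. The parameters defType, q, and qf are standard edismax parameters; the helper build_search_params, the catch-all field name text, and the boost values are assumptions made for illustration and are not taken from the schema designed earlier.

```python
from urllib.parse import urlencode


def build_search_params(user_query, field_boosts):
    """Build edismax request parameters that search several fields at once.

    Hypothetical helper: the field names and boosts passed in are illustrative.
    """
    # qf is a space-separated list of "field^boost" entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"defType": "edismax", "q": user_query, "qf": qf}


# Assumed fields: "text" stands in for a catch-all field, "brand" is boosted,
# and "clothes_color" is searched alongside them.
params = build_search_params(
    "pink sweater",
    {"text": 1, "brand": 2, "clothes_color": 1},
)

# URL-encode the parameters so they can be appended to a /select request.
query_string = urlencode(params)
print(query_string)
```

Note that every term of the user's input is matched against every field listed in qf, so the parser has no notion that pink was meant as a color rather than as anything else.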

Now, will our query give good results? What if there is a brand named pink? What would the results look like then? First of all, we cannot be sure whether pink is intended as the color or the brand. Suppose pink is intended as the color. Because we are also searching across brands, and the brand field contains pink as a brand name, the results will be a mix of matches on clothes_color and matches on brand. In our query, we are boosting brand, so what happens is...