Apache Solr Search Patterns

By: Jayant Kumar
Faceting with hierarchical taxonomy


You have probably come across e-commerce sites that show facets in a hierarchy. Take www.amazon.com as an example: a search for "shoes" returns the following hierarchy:

Department Shoes -> Men -> Outdoor -> Hiking & Trekking -> Hiking Boots

Hierarchical facets on www.amazon.com

How is such a hierarchy represented in Solr, and how are searches run against it?

In earlier versions of Solr, this was handled by a tokenizer known as solr.PathHierarchyTokenizerFactory. Each document stores the complete path leading to it in the hierarchy, and the tokenizer indexes every ancestor prefix of that path, so a single document is counted under a facet at each level of the hierarchy.

For example, the shoes hierarchy we saw earlier can be indexed as:

doc #1 : /dept_shoes/men/outdoor/hiking_trekking/hiking_boots
doc #2 : /dept_shoes/men/work/formals/

The PathHierarchyTokenizerFactory class breaks this field into tokens such as the following:

doc #1 : /dept_shoes, /dept_shoes/men, /dept_shoes...
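The prefix expansion can be illustrated outside Solr with a minimal sketch. The following is a plain-Python emulation of the idea, not the Lucene implementation: the function name path_hierarchy_tokens and the docs dictionary are hypothetical, and skipping empty segments (for example, after a trailing slash) is an assumption made for this illustration.

```python
from collections import Counter


def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of a delimited path, shortest first.

    Hypothetical helper that only emulates the prefix idea; it is not
    the tokenizer that Solr actually runs.
    """
    # Assumption: empty segments from leading or trailing delimiters are skipped.
    segments = [s for s in path.split(delimiter) if s]
    tokens = []
    prefix = ""
    for segment in segments:
        prefix = prefix + delimiter + segment
        tokens.append(prefix)
    return tokens


# The two example documents from the text.
docs = {
    1: "/dept_shoes/men/outdoor/hiking_trekking/hiking_boots",
    2: "/dept_shoes/men/work/formals/",
}

# Each document contributes one count to every prefix on its path,
# which is why one document appears under several facets.
facet_counts = Counter()
for doc_id, path in docs.items():
    facet_counts.update(path_hierarchy_tokens(path))

for token, count in sorted(facet_counts.items()):
    print(token, count)
```

In this sketch both documents share the prefixes /dept_shoes and /dept_shoes/men, so those two facet values are counted twice, while the deeper prefixes are counted once each.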