Book Image

Solr Cookbook - Third Edition

By : Rafal Kuc
Book Image

Solr Cookbook - Third Edition

By: Rafal Kuc

Overview of this book

Table of Contents (18 chapters)
Solr Cookbook Third Edition
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

Using patterns to replace tokens


Let's assume that we want to search inside user blog posts. We need to prepare a simple search returning only the identifier of the documents that were matched. However, we will want to remove some words because of explicit language. Of course, we can do this using the stop words functionality, but what if we want to know how many documents have their contents censored with compute statistics on. In such a case, we can't use the stop words functionality, we need something more, which means that we need regular expressions. This recipe will show you how to achieve such requirements using Solr and one of its filters.

How to do it...

To achieve our needs, we will use the solr.PatternReplaceFilterFactory filter. Let's assume that we want to remove all the words that start with the word prefix. These are the steps needed:

  1. First, we need to create our index structure, so the fields we add to the schema.xml file are as follows:

    <field name="id" type="string" indexed...