Apache Solr Search Patterns

By: Jayant Kumar

Preface

Apache Solr is one of the most widely used full-text search solutions, and many websites rely on it to provide their search function. Developing a search feature with a basic Solr setup is the starting point. At a later stage, most developers find it necessary to delve deeper into Solr to solve specific problems or add specific features. This book gives developers working with Solr a deeper insight into how it works, along with the strategies and concepts employed in building different solutions with Solr. You will not only learn how to tweak Solr, but will also understand how to use it to handle big data and solve scalability problems.

What this book covers

Chapter 1, Solr Indexing Internals, delves into how indexing happens in Solr and how analyzers and tokenizers work during index creation.

Chapter 2, Customizing the Solr Scoring Algorithm, discusses different scoring algorithms in Solr and how to tweak these algorithms and implement them in Solr.

Chapter 3, Solr Internals and Custom Queries, discusses in depth how relevance calculation happens and how scorers and filters work internally in Solr. This chapter also outlines how to create custom plugins in Solr.

Chapter 4, Solr for Big Data, focuses on processing big data for analysis purposes, including various faceting concepts and tools that can be used with Solr to plot graphs and charts.

Chapter 5, Solr in E-commerce, discusses the problems faced during the implementation of Solr in an e-commerce website and the related strategies and solutions.

Chapter 6, Solr for Spatial Search, focuses on spatial capabilities that the current and previous Solr versions possess. This chapter will also cover important concepts such as indexing and searching or filtering strategies together with varied query types that are available with a spatial search.

Chapter 7, Using Solr in an Advertising System, discusses the problems faced during the implementation of Solr to search in an advertising system and the related strategies and solutions.

Chapter 8, AJAX Solr, focuses on an AJAX Solr feature that helps reduce dependency on the application. This chapter will also cover an in-depth understanding of AJAX Solr as a framework and its implementation.

Chapter 9, SolrCloud, provides the complete procedure to implement SolrCloud and examines the benefits of using a distributed search with SolrCloud.

Chapter 10, Text Tagging with Lucene FST, covers the basics of finite-state transducers (FSTs) and their implementation, and guides us in designing an algorithm for text tagging that can be implemented using FSTs and further integrated with Solr.

What you need for this book

You will need a Windows or Linux machine with Apache configured to run the web server. You will also need the Java Development Kit (JDK) installed together with an editor to write Java programs. You will need Solr 4.8 or higher to understand the procedures.

Who this book is for

Basic knowledge of working with Solr is required to understand the advanced topics discussed in this book. An understanding of Java programming concepts is required to study the programs discussed in this book.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meanings.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "The mergeFactor setting controls how many segments a Lucene index is allowed to have before it is coalesced into one segment."

A block of code is set as follows:

//Create collection of documents to add to Solr server
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", 1);
doc1.addField("desc", "description text for doc 1");

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

//Create collection of documents to add to Solr server
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", 1);
doc1.addField("desc", "description text for doc 1");

Any command-line input or output is written as follows:

java -jar post.jar *.xml

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes, appear as follows: "In the index, we can see that the token Harry appears in both documents."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to , and mention the book title on the subject line of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in any of our books—maybe a mistake in the text or the code—we would be grateful if you reported this to us. By doing so, you can save other readers from frustration and help us improve the subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at if you are having a problem with any aspect of the book, and we will do our best to address it.