ElasticSearch Server

Overview of this book

ElasticSearch is an open source search server built on Apache Lucene. It was built to provide a scalable search solution with built-in support for near real-time search and multi-tenancy.

Jumping into the world of ElasticSearch by setting up your own custom cluster, this book will show you how to create a fast, scalable, and flexible search solution. By learning the ins and outs of data indexing and analysis, "ElasticSearch Server" will start you on your journey to mastering the powerful capabilities of ElasticSearch. With practical chapters covering how to search data, extend your search, and go deep into cluster administration and search analysis, this book is perfect for those new and experienced with search servers.

In "ElasticSearch Server" you will learn how to revolutionize your website or application with faster, more accurate, and flexible search functionality. Starting with chapters on setting up your own ElasticSearch cluster and searching and extending your search parameters, you will quickly be able to create a fast, scalable, and completely custom search solution.

Building on your knowledge further, you will learn about ElasticSearch's query API and become confident using powerful filtering and faceting capabilities. You will develop practical knowledge on how to make use of ElasticSearch's near real-time capabilities and support for multi-tenancy.

Your journey then concludes with chapters that help you monitor and tune your ElasticSearch cluster as well as advanced topics such as shard allocation, gateway configuration, and the discovery module.

Why is the result on later pages slow


Let's imagine that we have an index with several million documents. We already know how to build our queries, when to use filters, and so on. But looking at the query logs, we see that particular kinds of queries are significantly slower than the others. These queries may be using paging, with large values of the from parameter, that is, large offsets into the result set. From the application side, this can mean that users page through an enormous number of results. Often this doesn't make sense: if a user doesn't find the desired results on the first few pages, he or she gives up. Because such activity can indicate something bad (possible data theft), many applications limit paging to dozens of pages. In our case, we assume that this is a different scenario and that we have to provide this functionality.
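To make the role of the from parameter concrete, here is a minimal sketch of how an application might turn a page number into a search request body. The helper name page_request, the page size of 10, and the match_all query are assumptions chosen for illustration; only the from and size parameters come from the paging mechanism discussed above.

```python
import json


def page_request(page, page_size=10):
    """Build a search request body for a 1-based page number.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
        "query": {"match_all": {}},      # placeholder query
    }


# The first page skips nothing; page 5000 skips 49,990 hits.
print(json.dumps(page_request(1)))
print(json.dumps(page_request(5000)))
```

The two requests differ only in the from value, yet the second one is the kind of query that shows up as slow in the logs.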

What is the problem?

When ElasticSearch generates a response, it must determine the order of documents forming the result. If we are on the first page, this is not a big problem...
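The cost of ordering can be illustrated with a small simulation. This is a simplified model under stated assumptions, not ElasticSearch's actual implementation: the index is split into 5 shards of 20,000 documents each, scores are random numbers, and the function search_page is a hypothetical stand-in for the step that merges per-shard results. The point it shows is that, to return hits from offset to offset + size in global order, every shard has to supply its own best offset + size hits.

```python
import heapq
import random


def search_page(shards, offset, size):
    """Return one page of (score, doc_id) hits and the number of
    candidates that had to be collected and sorted to produce it."""
    wanted = offset + size
    candidates = []
    for shard in shards:
        # A shard cannot know how its scores rank against other shards,
        # so it must contribute its own best `wanted` hits.
        candidates.extend(heapq.nlargest(wanted, shard))
    merged = sorted(candidates, reverse=True)
    return merged[offset:offset + size], len(candidates)


random.seed(0)
# Assumed layout: 5 shards, 20,000 documents each, random scores.
shards = [
    [(random.random(), f"s{s}-d{d}") for d in range(20000)]
    for s in range(5)
]

first_page, first_cost = search_page(shards, offset=0, size=10)
deep_page, deep_cost = search_page(shards, offset=10000, size=10)
print(first_cost, deep_cost)  # prints "50 50050"
```

Both calls return ten hits, but in this model the deep page requires collecting and sorting about a thousand times more candidates than the first page.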