ElasticSearch Cookbook

By: Alberto Paro
Overview of this book

ElasticSearch is one of the most promising NoSQL technologies available and is built to provide a scalable search solution with built-in support for near real-time search and multi-tenancy. This practical guide is a complete reference for using ElasticSearch and covers the whole ElasticSearch ecosystem. We will get started by showing you how to choose the correct transport layer, communicate with the server, and create custom internal actions tailored to your needs. Starting with the basics of the ElasticSearch architecture and how to efficiently index, search, and execute analytics on it, you will learn how to extend ElasticSearch through scripting and how to monitor its behaviour. Step by step, this book will help you improve your ability to manage data at indexing time with more tailored mappings, along with searching and executing analytics with facets. The topics explored in the book also cover how to integrate ElasticSearch with Python and Java applications. This comprehensive guide will allow you to master storing, searching, and analyzing data with ElasticSearch.

Executing a scroll/scan search


A standard query works well when the matching documents do not change often. With live data, however, paginating through the results can behave inconsistently: as documents are added or removed between requests, a page may repeat a document already returned or skip one entirely. To avoid this problem, ElasticSearch provides an extra query parameter: scroll.
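
To make the pagination problem concrete, here is a minimal plain-Java sketch. It does not call ElasticSearch at all: the class name, the method names, and the in-memory list standing in for an index are all assumptions made for illustration. It pages through a list that changes between requests, then pages through a frozen copy of the same list, which is roughly the idea behind keeping a query's result IDs for the duration of a scroll.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model only: a List<String> stands in for a live index.
class ScrollSnapshotSketch {

    // Returns at most `size` items starting at offset `from` (offset-based paging).
    static List<String> page(List<String> source, int from, int size) {
        int start = Math.min(from, source.size());
        int end = Math.min(from + size, source.size());
        return new ArrayList<>(source.subList(start, end));
    }

    // Pages directly over the live list; a document is added between the two requests.
    static List<String> pageOverLiveData() {
        List<String> index = new ArrayList<>(Arrays.asList("doc1", "doc2", "doc3", "doc4"));
        List<String> seen = new ArrayList<>(page(index, 0, 2)); // first page: doc1, doc2
        index.add(0, "doc0");                                   // new document now sorts first
        seen.addAll(page(index, 2, 2));                         // second page: doc2, doc3
        return seen;
    }

    // Freezes the matching IDs once, then pages over the frozen copy.
    static List<String> pageOverSnapshot() {
        List<String> index = new ArrayList<>(Arrays.asList("doc1", "doc2", "doc3", "doc4"));
        List<String> snapshot = new ArrayList<>(index);            // IDs captured at query time
        List<String> seen = new ArrayList<>(page(snapshot, 0, 2)); // first page: doc1, doc2
        index.add(0, "doc0");                                      // the live index still changes
        seen.addAll(page(snapshot, 2, 2));                         // second page: doc3, doc4
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("live:     " + pageOverLiveData()); // [doc1, doc2, doc2, doc3]
        System.out.println("snapshot: " + pageOverSnapshot()); // [doc1, doc2, doc3, doc4]
    }
}
```

In the live case, doc2 is returned twice and doc4 is pushed beyond the two pages requested; over the snapshot, every document appears exactly once. The cost of the snapshot approach is that the frozen IDs must be held somewhere for as long as the client keeps paging.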

Getting ready

You need a working ElasticSearch cluster and a working copy of Maven.

The code for this recipe is in chapter_10/nativeclient in the book's code bundle, available on Packt's website; the relevant class is ScrollScanQueryExample.

How to do it...

The search is executed as in the previous recipe. The main difference is the setScroll timeout, which tells ElasticSearch to keep the result IDs for the query in memory for a defined period of time.

We can change the code of the previous recipe by using scroll in the following way:

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.Client;
import org.elasticsearch...