Elasticsearch 7 Quick Start Guide

By: Anurag Srivastava, Douglas Miller

Overview of this book

Elasticsearch is one of the most popular tools for distributed search and analytics. This Elasticsearch book highlights the latest features of Elasticsearch 7 and helps you understand how you can use them to build your own search applications with ease. Starting with an introduction to the Elastic Stack, this book will help you quickly get up to speed with using Elasticsearch. You'll learn how to install, configure, manage, secure, and deploy Elasticsearch clusters, as well as how to use your deployment to develop powerful search and analytics solutions. As you progress, you'll also understand how to troubleshoot any issues that you may encounter along the way. Finally, the book will help you explore the inner workings of Elasticsearch and gain insights into queries, analyzers, mappings, and aggregations as you learn to work with search results. By the end of this book, you'll have a basic understanding of how to build and deploy effective search and analytics solutions using Elasticsearch.
Table of Contents (10 chapters)

Data sparsity

In previous versions of Elasticsearch, sparse documents (documents that populate only a few of the fields in an index) were to be avoided because of Lucene's structure. Lucene identifies documents internally by document IDs, which its internal APIs then use to communicate with one another. For example, given a document ID produced by a search query, Lucene retrieves the norm value for that document by reading the byte at the index of the document ID.

Lucene is a full-featured text search engine that is written in Java, and Elasticsearch is built on top of Lucene.

This is, at the same time, both very efficient and storage-intensive: Lucene can quickly access the norm values, but documents that have no value for a field still use one byte of storage each. This means that if an index has x documents, the norms require x bytes of storage per field. This not only affects the sparsity requirements, but also the indexing...
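The storage cost described above can be illustrated with a small Python sketch. This is only a toy model of a dense, one-byte-per-document layout, not Lucene's actual implementation; the class `DenseNorms` and the helper `total_norm_bytes` are hypothetical names introduced here for illustration.

```python
class DenseNorms:
    """Toy model (an assumption, not Lucene code): one norm byte per document for one field."""

    def __init__(self, max_doc: int) -> None:
        # bytearray(n) allocates n zero bytes: one slot per document ID,
        # whether or not the document has a value for this field.
        self._norms = bytearray(max_doc)

    def set_norm(self, doc_id: int, norm: int) -> None:
        # The norm must fit in a single byte (0..255).
        self._norms[doc_id] = norm

    def norm(self, doc_id: int) -> int:
        # Lookup is a direct index into the array by document ID.
        return self._norms[doc_id]

    def storage_bytes(self) -> int:
        # Storage depends only on the number of documents, not on how many have a value.
        return len(self._norms)


def total_norm_bytes(num_docs: int, num_fields: int) -> int:
    """x documents need x bytes per field, so the total grows with both counts."""
    return num_docs * num_fields


if __name__ == "__main__":
    norms = DenseNorms(max_doc=1000)
    norms.set_norm(3, 124)  # only one document has a value for this field
    print(norms.norm(3), norms.norm(4), norms.storage_bytes())
```

In this model, a field that appears in a single document out of 1,000 still occupies 1,000 bytes, which is why sparse fields were wasteful under such a layout.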