Elasticsearch 8.x Cookbook - Fifth Edition

By: Alberto Paro

Overview of this book

Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content with petabytes of data. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index mapping, aggregation, working with queries, and scripting using Elasticsearch. You'll focus on numerous solutions and quick techniques for performing both common and uncommon tasks such as deploying Elasticsearch nodes, using the ingest module, working with X-Pack, and creating different visualizations. As you advance, you'll learn how to manage various clusters, restore data, and install Kibana to monitor a cluster and extend it using a variety of plugins. Furthermore, you'll understand how to integrate your Java, Scala, Python, and big data applications such as Apache Spark and Pig with Elasticsearch and create efficient data applications powered by enhanced functionalities and custom plugins. By the end of this Elasticsearch cookbook, you'll have gained in-depth knowledge of implementing the Elasticsearch architecture and be able to manage, search, and store data efficiently and effectively using Elasticsearch.

Mapping a GeoShape field

A shape is an extension of the concept of a point. Elasticsearch provides the geo_shape type (GeoShape), which allows you to manage arbitrary polygons.

Getting ready

You will need an up-and-running Elasticsearch installation, as we described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To be able to use advanced shape management, Elasticsearch requires two JAR libraries in its classpath (usually the lib directory), as follows:

  • Spatial4J (v0.3)
  • JTS (v1.13)

How to do it…

To map a geo_shape type, a user can explicitly provide the following parameters:

  • tree (the default is geohash): This is the name of the PrefixTree implementation: geohash for GeohashPrefixTree and quadtree for QuadPrefixTree.
  • precision: This is used instead of tree_levels to provide a more human-readable value for the tree level. The precision number can be followed by a unit; for example, 10m, 10km, 10miles, and so on.
  • tree_levels: This is the maximum number of layers to be used in the prefix tree.
  • distance_error_pct: This sets the maximum error allowed in the prefix tree (the default is 0.025, that is, 2.5%, and the maximum is 0.5).
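
To see how precision and tree_levels relate, the following Python sketch estimates how many quadtree levels are needed before a cell is no wider than a requested precision. It is a rough illustration, not the conversion that Elasticsearch performs internally: the Earth circumference constant, the small unit table, and the function names parse_precision and approx_quadtree_levels are all assumptions made for this example.

```python
import math

# Assumption: Earth's equatorial circumference in meters (approximate).
EARTH_CIRCUMFERENCE_M = 40_075_017

# Unit suffixes handled by this sketch only (not Elasticsearch's full list).
UNIT_TO_METERS = {"m": 1.0, "km": 1000.0, "miles": 1609.344}


def parse_precision(value: str) -> float:
    """Convert a precision string such as '10km' into meters."""
    text = value.strip().lower()
    # Try longer suffixes first so that 'km' is not mistaken for 'm'.
    for unit in sorted(UNIT_TO_METERS, key=len, reverse=True):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNIT_TO_METERS[unit]
    raise ValueError(f"unsupported precision: {value!r}")


def approx_quadtree_levels(precision: str) -> int:
    """Estimate the quadtree depth at which a cell is no wider than `precision`.

    Each quadtree level halves the cell width, so the depth is the base-2
    logarithm of (world width / requested width), rounded up.
    """
    meters = parse_precision(precision)
    return max(1, math.ceil(math.log2(EARTH_CIRCUMFERENCE_M / meters)))


if __name__ == "__main__":
    for p in ("1m", "10km", "10miles"):
        print(p, "->", approx_quadtree_levels(p), "levels (rough estimate)")
```

The point of the sketch is the trade-off: a finer precision means a deeper tree, and therefore more terms per indexed shape.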

Using geo_shape, the customer_location mapping that we saw in the previous recipe will be as follows:

"customer_location": {
  "type": "geo_shape",
  "tree": "quadtree",
  "precision": "1m"
},

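As a minimal sketch, the customer_location fragment above can be placed inside a complete index-creation body. The Python below only builds and prints that body; it does not contact a cluster. The helper name build_index_body is hypothetical, and the sketch assumes an Elasticsearch version that accepts the prefix-tree parameters shown in this recipe.

```python
import json


def build_index_body(tree: str = "quadtree", precision: str = "1m") -> dict:
    """Return an index-creation body holding the customer_location mapping.

    Hypothetical helper for illustration; the field name and parameter
    values are the ones used in this recipe.
    """
    return {
        "mappings": {
            "properties": {
                "customer_location": {
                    "type": "geo_shape",
                    "tree": tree,
                    "precision": precision,
                }
            }
        }
    }


body = build_index_body()

if __name__ == "__main__":
    # Print the JSON that would be sent as the request body when creating the index.
    print(json.dumps(body, indent=2))
```

Keeping the body as a plain dictionary makes it easy to inspect or adjust the tree and precision values before sending it with whichever client you use.
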
How it works…

When a shape is indexed or searched, a path tree is created and used internally.

A path tree is a list of terms that contain geographic information and are computed to improve performance when evaluating geographic calculations.
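
The idea of a list of terms can be illustrated with a simplified quadtree: each level splits the current cell into four quadrants, and the path of quadrant labels from the root down becomes one term per level. The Python below is a sketch of that idea for a single point, not the actual Lucene implementation; the function name quadtree_terms and the quadrant labels A to D are assumptions for this example.

```python
def quadtree_terms(lon: float, lat: float, levels: int) -> list:
    """Return one term per level describing the cell that contains the point.

    Quadrant labels (an assumption of this sketch):
    A = north-west, B = north-east, C = south-west, D = south-east.
    """
    min_lon, max_lon = -180.0, 180.0
    min_lat, max_lat = -90.0, 90.0
    path = ""
    terms = []
    for _ in range(levels):
        mid_lon = (min_lon + max_lon) / 2
        mid_lat = (min_lat + max_lat) / 2
        north = lat >= mid_lat
        east = lon >= mid_lon
        if north and not east:
            quadrant = "A"
        elif north and east:
            quadrant = "B"
        elif not north and not east:
            quadrant = "C"
        else:
            quadrant = "D"
        # Shrink the bounding box to the chosen quadrant.
        if north:
            min_lat = mid_lat
        else:
            max_lat = mid_lat
        if east:
            min_lon = mid_lon
        else:
            max_lon = mid_lon
        path += quadrant
        terms.append(path)
    return terms


if __name__ == "__main__":
    # Each successive term narrows the cell around the point.
    print(quadtree_terms(10.0, 45.0, 3))
```

Because every term is a prefix of the next one, two shapes that share a long prefix are close together, which is what makes term-based lookups fast for spatial queries.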

The path tree also depends on the shape's type: point, linestring, polygon, multipoint, or multipolygon.
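
For reference, the shape types listed above can be written as GeoJSON-style values for the customer_location field. The Python below is a sketch that builds one example document per type; the coordinates are made up for illustration, the helper closed_ring is hypothetical, and coordinates are given in [longitude, latitude] order as GeoJSON requires.

```python
import json


def closed_ring(points):
    """Return a polygon ring whose last point repeats the first one."""
    ring = [list(p) for p in points]
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return ring


# Illustrative coordinates only; a polygon ring must be closed.
square = closed_ring([(0, 0), (10, 0), (10, 10), (0, 10)])
other_square = closed_ring([(20, 20), (30, 20), (30, 30), (20, 30)])

shapes = {
    "point": {"type": "Point", "coordinates": [12.49, 41.89]},
    "linestring": {
        "type": "LineString",
        "coordinates": [[12.49, 41.89], [9.19, 45.46]],
    },
    "polygon": {"type": "Polygon", "coordinates": [square]},
    "multipoint": {
        "type": "MultiPoint",
        "coordinates": [[12.49, 41.89], [9.19, 45.46]],
    },
    "multipolygon": {
        "type": "MultiPolygon",
        "coordinates": [[square], [other_square]],
    },
}

# One example document per shape type, using the field name from this recipe.
documents = [{"customer_location": shape} for shape in shapes.values()]

if __name__ == "__main__":
    for doc in documents:
        print(json.dumps(doc))
```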

See also

To understand the logic behind GeoShape, good resources are the Elasticsearch documentation page on GeoShape and the sites of the libraries that are used for geographic calculations: Spatial4J (https://github.com/spatial4j/spatial4j) and JTS (http://central.maven.org/maven2/com/vividsolutions/jts/1.13/).