Elasticsearch 8.x Cookbook - Fifth Edition

By: Alberto Paro

Overview of this book

Elasticsearch is a Lucene-based distributed search engine at the heart of the Elastic Stack that allows you to index and search unstructured content with petabytes of data. With this updated fifth edition, you'll cover comprehensive recipes relating to what's new in Elasticsearch 8.x and see how to create and run complex queries and analytics. The recipes will guide you through performing index mapping, aggregation, working with queries, and scripting using Elasticsearch. You'll focus on numerous solutions and quick techniques for performing both common and uncommon tasks such as deploying Elasticsearch nodes, using the ingest module, working with X-Pack, and creating different visualizations. As you advance, you'll learn how to manage various clusters, restore data, and install Kibana to monitor a cluster and extend it using a variety of plugins. Furthermore, you'll understand how to integrate your Java, Scala, Python, and big data applications such as Apache Spark and Pig with Elasticsearch and create efficient data applications powered by enhanced functionalities and custom plugins. By the end of this Elasticsearch cookbook, you'll have gained in-depth knowledge of implementing the Elasticsearch architecture and be able to manage, search, and store data efficiently and effectively using Elasticsearch.

Using explicit mapping creation

If we think of an index as a database in the SQL world, a mapping is similar to a CREATE TABLE definition.

Elasticsearch can infer the structure of the document that you are indexing (reflection) and create the mapping definition automatically. This recipe calls the process explicit mapping creation; the Elasticsearch documentation refers to it as dynamic mapping.

Getting ready

To execute the code in this recipe, you will need an up-and-running Elasticsearch installation, as described in the Downloading and installing Elasticsearch recipe of Chapter 1, Getting Started.

To execute these commands, you can use any HTTP client, such as curl (https://curl.haxx.se/), Postman (https://www.getpostman.com/), or a similar tool. I suggest using the Kibana console, which provides code completion and better character escaping for Elasticsearch.

To understand the examples and code in this recipe, basic knowledge of JSON is required.
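
For readers who would rather script the calls than use curl or the Kibana console, the following is a minimal sketch that uses only Python's standard library. The build_request and send helpers and the http://localhost:9200 address are assumptions made for illustration, not part of the recipe. Elasticsearch 8.x enables TLS and authentication by default, so a real node will usually need an https URL and credentials.

```python
import json
import urllib.request

# Assumption: an unsecured local node. Adjust the URL and add credentials
# for a default Elasticsearch 8.x installation, which has security enabled.
BASE_URL = "http://localhost:9200"


def build_request(method, path, body=None, base_url=BASE_URL):
    """Build (but do not send) an HTTP request for an Elasticsearch endpoint."""
    data = None
    headers = {}
    if body is not None:
        # Elasticsearch expects JSON bodies with an explicit content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def send(request):
    """Send the request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build the document-indexing request used in this recipe without sending it.
req = build_request("PUT", "test/_doc/1", {"name": "Paul", "age": 35})
print(req.get_method(), req.full_url)
```

Calling send(req) against a reachable node would return the parsed JSON response; the sketch leaves that call out so that it runs without a cluster.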

How to do it…

You can explicitly create a mapping by adding a new document to Elasticsearch. For this, perform the following steps:

  1. Create an index, as shown in the following code:
    PUT test

The output will be as follows:

{ "acknowledged" : true, "shards_acknowledged" : true,
 "index" : "test" }
  2. Put a document in the index, as shown in the following code:
    PUT test/_doc/1
    {"name":"Paul", "age":35}

The output will be as follows:

{
  "_index" : "test", "_id" : "1", "_version" : 1,
  "result" : "created",
  "_shards" : {"total" : 2, "successful" : 1, "failed" : 0 },
  "_seq_no" : 0,  "_primary_term" : 1
}
  3. Get the mapping with the following code:
    GET test/_mapping
  4. The mapping that's auto-created by Elasticsearch should look as follows:
    {
      "test" : {
        "mappings" : {
          "properties" : {
            "age" : { "type" : "long" },
            "name" : {
              "type" : "text",
              "fields" : {
                "keyword" : {"type" : "keyword", "ignore_above" : 256 }
    } } } } } }
  5. To delete the index, you can use the following command:
    DELETE test

The output will be as follows:

{ "acknowledged" : true }
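
The mapping response shown in the steps above is deeply nested, which makes it awkward to scan once an index has more than a few fields. As a rough sketch, the response can be flattened into a field-path-to-type table in Python. The flatten_mapping helper is hypothetical (it is not part of any Elasticsearch client), and the response is pasted in as a literal rather than fetched from a cluster.

```python
def flatten_mapping(properties, prefix=""):
    """Return {field_path: type} for a mapping "properties" dictionary."""
    fields = {}
    for name, definition in properties.items():
        path = prefix + name
        if "type" in definition:
            fields[path] = definition["type"]
        # Multi-fields (such as name.keyword) are listed under "fields".
        fields.update(flatten_mapping(definition.get("fields", {}), path + "."))
        # Object fields nest their children under "properties".
        fields.update(flatten_mapping(definition.get("properties", {}), path + "."))
    return fields


# The GET test/_mapping response from this recipe, as a Python literal.
mapping_response = {
    "test": {
        "mappings": {
            "properties": {
                "age": {"type": "long"},
                "name": {
                    "type": "text",
                    "fields": {
                        "keyword": {"type": "keyword", "ignore_above": 256}
                    },
                },
            }
        }
    }
}

field_types = flatten_mapping(mapping_response["test"]["mappings"]["properties"])
for path, field_type in field_types.items():
    print(path, field_type)
```

The name.keyword entry is the keyword sub-field that Elasticsearch adds to auto-detected strings, so that the same value can be used for both full-text search and exact matching.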

How it works…

The first command (Step 1) creates an index in which we can store documents and, if required, configure the mappings later.

The second command (Step 2) inserts a document in the index (we'll learn how to create the index in the Creating an index recipe of Chapter 3, Basic Operations, and record indexing in the Indexing a document recipe of Chapter 3, Basic Operations).

During indexing, Elasticsearch checks each field of the document against the current mapping and processes it as follows:

  • If the field is already present in the mapping and the value of the field is valid (it matches the correct type), Elasticsearch does not need to change the current mappings.
  • If the field is already present in the mapping but the value of the field is of a different type, it tries to upgrade the field type (for example, from integer to long). If the types are not compatible, it throws an exception, and the indexing process fails.
  • If the field is not present, it tries to auto-detect the type of field. It updates the mappings with a new field mapping. (In the case of a null value, it skips the mapping update until it encounters a concrete type.)
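
The decision process in the list above can be modeled with a short Python sketch. This is a deliberately simplified model, not Elasticsearch's implementation: the apply_document, detect_type, and is_compatible names are hypothetical, only scalar JSON values are handled, and type upgrades, value coercion, and date detection for strings are left out.

```python
# Sub-field added to auto-detected strings, as seen in this recipe's mapping.
KEYWORD_SUBFIELD = {"keyword": {"type": "keyword", "ignore_above": 256}}

# Python types accepted by each mapped type in this simplified model.
ACCEPTED = {
    "long": (int,),
    "float": (int, float),
    "boolean": (bool,),
    "text": (str,),
}


def detect_type(value):
    """Guess a field mapping for a scalar JSON value."""
    # Check bool first: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        return {"type": "text", "fields": dict(KEYWORD_SUBFIELD)}
    raise TypeError(f"value not handled by this sketch: {value!r}")


def is_compatible(value, mapped_type):
    """Return True if the value is acceptable for an already mapped type."""
    accepted = ACCEPTED[mapped_type]
    if isinstance(value, bool):
        return bool in accepted
    return isinstance(value, accepted)


def apply_document(properties, document):
    """Update the mapping properties in place for one document."""
    for field, value in document.items():
        if value is None:
            continue  # null: skip until a concrete value is seen
        if field in properties:
            if not is_compatible(value, properties[field]["type"]):
                raise ValueError(f"field {field!r} cannot accept {value!r}")
        else:
            properties[field] = detect_type(value)
    return properties


properties = apply_document({}, {"name": "Paul", "age": 35, "nickname": None})
print(properties)
```

Indexing a second document such as {"age": 36} leaves the properties unchanged, whereas {"age": "old"} raises a ValueError in this model, mirroring the failed-indexing case in the list.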

There's more…

In Elasticsearch, every document has an identifier, called an ID, that is unique within a single index and is stored in the special _id field of the document.

The _id field can be provided at index time or can be assigned automatically by Elasticsearch if it is missing.
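
At the REST level, the two cases correspond to two forms of the index request: PUT with the ID in the path, or POST to the _doc endpoint without one, in which case Elasticsearch generates the ID. The following sketch only builds the method and path; the index_endpoint helper is a hypothetical name used for illustration.

```python
from urllib.parse import quote


def index_endpoint(index, doc_id=None):
    """Return the (method, path) pair for indexing a document."""
    if doc_id is None:
        # No ID supplied: Elasticsearch assigns one automatically.
        return ("POST", f"{index}/_doc")
    # Caller-supplied ID: percent-encode it so that it is safe in a URL path.
    return ("PUT", f"{index}/_doc/{quote(str(doc_id), safe='')}")


print(index_endpoint("test", 1))
print(index_endpoint("test"))
```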

When a mapping is created or changed, Elasticsearch automatically propagates the change to all the nodes in the cluster so that all the shards are aligned to process the updated fields.

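In Elasticsearch 7.x, there was a default type (_doc); mapping types were removed in Elasticsearch 8.x and above. The _doc that still appears in request paths such as PUT test/_doc/1 is the name of the endpoint, not a type.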

See also

Please refer to the following recipes in Chapter 3, Basic Operations:

  • The Creating an index recipe, which is about putting new mappings in an index while it's being created
  • The Putting a mapping in an index recipe, which is about extending a mapping in an index