Elasticsearch 7.0 Cookbook - Fourth Edition

By: Alberto Paro

Overview of this book

Elasticsearch is a Lucene-based distributed search server that allows users to index and search unstructured content at petabyte scale. With this book, you'll be guided through comprehensive recipes on what's new in Elasticsearch 7, and see how to create and run complex queries and analytics. Packed with recipes on performing index mapping, aggregation, and scripting using Elasticsearch, this fourth edition of Elasticsearch Cookbook will get you acquainted with numerous solutions and quick techniques for performing both everyday and uncommon tasks, such as deploying Elasticsearch nodes, integrating other tools with Elasticsearch, and creating different visualizations. You will install Kibana to monitor a cluster and also extend it using a variety of plugins. Finally, you will integrate your Java, Scala, Python, and big data applications such as Apache Spark and Pig with Elasticsearch, and create efficient data applications powered by enhanced functionalities and custom plugins. By the end of this book, you will have gained in-depth knowledge of implementing Elasticsearch architecture, and you'll be able to manage, search, and store data efficiently and effectively using Elasticsearch.

Setting up networking

Correctly setting up networking is very important for your nodes and cluster.

There are a lot of different installation scenarios and networking issues. The first step for configuring the nodes to build a cluster is to correctly set the node discovery.

Getting ready

To change configuration files, you will need a working Elasticsearch installation and a simple text editor, as well as your current networking configuration (your IP).

How to do it…

To set up networking, use the following steps:

  1. With the standard Elasticsearch configuration file, config/elasticsearch.yml, your node binds to the localhost interface by default, so it can't be accessed by external machines or nodes.
  2. To allow another machine to connect to our node, we need to set network.host to our IP (for example, I have 192.168.1.164).
  3. To be able to discover other nodes, we need to list them in the discovery.zen.ping.unicast.hosts parameter. The node sends a ping to each machine in the unicast list and waits for a response. If a node responds, the two can join in a cluster.
  4. In general, from Elasticsearch version 6.x onward, node versions are compatible with each other. The nodes must have the same cluster name (the cluster.name option in elasticsearch.yml) to join each other.
     The best practice is to have all the nodes installed with the same Elasticsearch version (major.minor.release). This suggestion is also valid for third-party plugins.
  5. To customize the network preferences, you need to change some parameters in the elasticsearch.yml file, as follows:
cluster.name: ESCookBook
node.name: "Node1"
network.host: 192.168.1.164
discovery.zen.ping.unicast.hosts: ["192.168.1.164","192.168.1.165[9300-9400]"]
  6. This configuration sets the cluster name to ESCookBook, the node name to Node1, and the network address to 192.168.1.164, and lists the hosts that the node contacts to discover the cluster.
  7. We can now start the server and check the log output during node loading to verify that the networking is configured, as follows:
    [2018-10-28T17:42:16,386][INFO ][o.e.c.s.MasterService ] [Node1] zen-disco-elected-as-master ([0] nodes joined)[, ], reason: new_master {Node1}{fyBySLMcR3uqKiYC32P5Sg}{IX1wpA01QSKkruZeSRPlFg}{192.168.1.164}{192.168.1.164:9300}{ml.machine_memory=17179869184, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
    [2018-10-28T17:42:16,390][INFO ][o.e.c.s.ClusterApplierService] [Node1] new_master {Node1}{fyBySLMcR3uqKiYC32P5Sg}{IX1wpA01QSKkruZeSRPlFg}{192.168.1.164}{192.168.1.164:9300}{ml.machine_memory=17179869184, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, reason: apply cluster state (from master [master {Node1}{fyBySLMcR3uqKiYC32P5Sg}{IX1wpA01QSKkruZeSRPlFg}{192.168.1.164}{192.168.1.164:9300}{ml.machine_memory=17179869184, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)[, ]]])
    [2018-10-28T17:42:16,403][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] [Node1] publish_address {192.168.1.164:9200}, bound_addresses {192.168.1.164:9200}
    [2018-10-28T17:42:16,403][INFO ][o.e.n.Node ] [Node1] started
    [2018-10-28T17:42:16,600][INFO ][o.e.l.LicenseService ] [Node1] license [b2754b17-a4ec-47e4-9175-4b2e0d714a45] mode [basic] - valid

As the log output shows, the transport is bound to 192.168.1.164:9300, and the REST HTTP interface is bound to 192.168.1.164:9200.
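If you prefer to check this from a script rather than by eye, the publish address can be pulled out of a log line. The following is only a minimal sketch: find_publish_address is a hypothetical helper name, and the pattern assumes the IPv4 "publish_address {host:port}" form seen in the log excerpt of this recipe.

```python
import re

# Matches the "publish_address {host:port}" fragment seen in the startup log.
# Assumption: host contains no colon (IPv4 or hostname), as in this recipe's log.
PUBLISH_RE = re.compile(r"publish_address \{([^{}:]+):(\d+)\}")


def find_publish_address(log_line):
    """Return (host, port) from a log line, or None if no publish_address is present."""
    match = PUBLISH_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Sample line copied from the log output of this recipe.
sample = (
    "[2018-10-28T17:42:16,403][INFO ][o.e.x.s.t.n.SecurityNetty4HttpServerTransport] "
    "[Node1] publish_address {192.168.1.164:9200}, bound_addresses {192.168.1.164:9200}"
)
print(find_publish_address(sample))
```

Lines without a publish_address fragment, such as the "started" line, simply yield None.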

How it works…

The following are the most important configuration keys for networking management:

  • cluster.name: This sets up the name of the cluster. Only nodes with the same cluster name can join together.
  • node.name: This defines a name for the node. If not defined, it is automatically assigned by Elasticsearch.

If you have a lot of nodes on different machines, it is useful to set their names to something meaningful in order to easily locate them. A meaningful name is easier to remember than a generated one such as fyBySLMcR3uqKiYC32P5Sg.

You should always set node.name if you need to monitor your server. Generally, the node name is the same as the host server name for easy maintenance.

network.host defines the IP address of your machine that the node binds to. If your server is connected to several LANs, or you want to limit the bind to only one LAN, you must set this value to your server IP.
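Before writing an address into network.host, it can help to confirm what kind of address it is. The sketch below uses Python's standard ipaddress module; describe_bind_address is a hypothetical helper for illustration, and it handles only literal IP addresses, not hostnames.

```python
import ipaddress


def describe_bind_address(value):
    """Classify a candidate network.host value given as a literal IP address."""
    addr = ipaddress.ip_address(value)  # raises ValueError for malformed input
    if addr.is_loopback:
        return "loopback: only reachable from this machine"
    if addr.is_unspecified:
        return "wildcard: binds on every interface"
    return "specific interface: reachable from that network"


# The default localhost bind versus the address used in this recipe.
print(describe_bind_address("127.0.0.1"))
print(describe_bind_address("192.168.1.164"))
```

A malformed value raises ValueError, which catches typing mistakes before the node is restarted.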

discovery.zen.ping.unicast.hosts allows you to define a list of hosts (with ports or a port range) used to discover other nodes to join the cluster. The preferred port is the transport one, usually 9300. In Elasticsearch 7.x, this setting name is deprecated in favor of discovery.seed_hosts.

The addresses in the hosts list can be a mix of the following:

  • Hostname, for example, myhost1
  • IP address, for example, 192.168.1.12
  • IP address or hostname with the port, for example, myhost1:9300, 192.168.1.2:9300
  • IP address or hostname with a range of ports, for example, myhost1:[9300-9400], 192.168.1.2:[9300-9400]
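To make the accepted forms concrete, the following sketch splits one entry into a host and a port range. It is an illustrative helper (parse_host_entry is a hypothetical name), not the parser Elasticsearch itself uses. It accepts both bracket spellings that appear in this recipe, with and without the colon, and it does not validate the host part.

```python
import re

# host, optionally followed by ":port", "[start-end]" or ":[start-end]"
ENTRY_RE = re.compile(
    r"^(?P<host>[^:\[\]]+)"
    r"(?::?\[(?P<start>\d+)-(?P<end>\d+)\]|:(?P<port>\d+))?$"
)


def parse_host_entry(entry):
    """Return (host, first_port, last_port); ports are None when none is given."""
    match = ENTRY_RE.match(entry.strip())
    if match is None:
        raise ValueError(f"unrecognized host entry: {entry!r}")
    host = match.group("host")
    if match.group("start") is not None:
        first, last = int(match.group("start")), int(match.group("end"))
        if first > last:
            raise ValueError(f"empty port range in entry: {entry!r}")
        return host, first, last
    if match.group("port") is not None:
        port = int(match.group("port"))
        return host, port, port
    # No port given: leave the choice of default to the caller.
    return host, None, None


for entry in ["myhost1", "192.168.1.12", "myhost1:9300", "myhost1:[9300-9400]"]:
    print(entry, "->", parse_host_entry(entry))
```

Returning None for a missing port keeps the sketch from assuming which default port the node would pick.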

See also

The Setting up a node recipe in this chapter