Book Image

Solr Cookbook - Third Edition

By : Rafal Kuc
Book Image

Solr Cookbook - Third Edition

By: Rafal Kuc

Overview of this book

Table of Contents (18 chapters)
Solr Cookbook Third Edition
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

Installing ZooKeeper for SolrCloud


You might know that in order to run SolrCloud, the distributed Solr deployment, you need to have Apache ZooKeeper installed. Zookeeper is a centralized service for maintaining configurations, naming, and provisioning service synchronizations. SolrCloud uses ZooKeeper to synchronize configurations and cluster states to help with leader election and so on. This is why it is crucial to have a highly available and fault-tolerant ZooKeeper installation. If you have a single ZooKeeper instance, and it fails, then your SolrCloud cluster will crash too. So, this recipe will show you how to install ZooKeeper so that it's not a single point of failure in your cluster configuration.

Getting ready

The installation instructions in this recipe contain information about installing ZooKeeper Version 3.4.6, but it should be useable for any minor release changes of Apache ZooKeeper. To download ZooKeeper, visit http://zookeeper.apache.org/releases.html. This recipe will show you how to install ZooKeeper in a Linux-based environment. For ZooKeeper to work, Java needs to be installed.

How to do it...

Let's assume that we have decided to install ZooKeeper in the /usr/share/zookeeper directory of our server, and we want to have three servers (with IPs 192.168.1.1, 192.168.1.2, and 192.168.1.3) hosting a distributed ZooKeeper installation. This can be done by performing the following steps:

  1. After downloading the ZooKeeper installation, we create the necessary directory:

    sudo mkdir /usr/share/zookeeper 
    
  2. Then, we unpack the downloaded archive to the newly created directory. We do this on three servers.

  3. Next, we need to change our ZooKeeper configuration file and specify the servers that will form a ZooKeeper quorum. So, we edit the /usr/share/zookeeper/conf/zoo.cfg file and add the following entries:

    clientPort=2181
    dataDir=/usr/share/zookeeper/data
    tickTime=2000
    initLimit=10
    syncLimit=5
    server.1=192.168.1.1:2888:3888
    server.2=192.168.1.2:2888:3888
    server.3=192.168.1.3:2888:3888
  4. Now, the next thing we need to do is create a file called myid in the /usr/share/zookeeper/data directory. The file should contain a single number that corresponds to the server number. For example, if ZooKeeper is located on 192.168.1.1, it will be 1, and if ZooKeeper is located on 192.168.1.3, it will be 3, and so on.

  5. Now, we can start the ZooKeeper servers with the following command:

    /usr/share/zookeeper/bin/zkServer.sh start
    
  6. If everything goes well, you should see something like:

    JMX enabled by default
    Using config: /usr/share/zookeeper/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    

That's all. Of course, you can also add the ZooKeeper service to start automatically as your operating system starts up, but this is beyond the scope of the recipe and book.

How it works...

I talked about the ZooKeeper quorum and started this using three ZooKeeper nodes. ZooKeeper operates in a quorum, which means that at least 50 percent plus one server needs to be available and connected. We can start with a single ZooKeeper server, but such deployment won't be highly available and resistant to failures. So, to be able to handle at least a single ZooKeeper node failure, we need at least three ZooKeeper nodes running.

Let's skip the first part because creating the directory and unpacking the ZooKeeper server is quite simple. What I would like to concentrate on are the configuration values of the ZooKeeper server. The clientPort property specifies the port on which our SolrCloud servers should connect to ZooKeeper. The dataDir property specifies the directory where ZooKeeper will hold its data. Note that ZooKeeper needs read and write permissions to the directory. So far so good, right? So, now, the more advanced properties, such as tickTime, specified in milliseconds is the basic time unit for ZooKeeper. The initLimit property specifies how many ticks the initial synchronization phase can take. Finally, syncLimit specifies how many ticks can pass between sending the request and receiving an acknowledgement.

There are also three additional properties present, server.1, server.2, and server.3. These three properties define the addresses of the ZooKeeper instances that will form the quorum. The values for each of these properties are separated by a colon character. The first part is the IP address of the ZooKeeper server, and the second and third parts are the ports used by ZooKeeper instances to communicate with each other.

The last thing is the myid file located in the /usr/share/zookeeper/data directory. The contents of the file is used by ZooKeeper to identify itself. This is why we need to properly configure it so that ZooKeeper is not confused. So, for the ZooKeeper server specified as server.1, we need to write 1 to the myid file.

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.