Elasticsearch Server - Third Edition

By: Rafal Kuc

Overview of this book

Elasticsearch is a very fast and scalable open source search engine, designed with distribution and cloud in mind, complete with all the goodies that Apache Lucene has to offer. Elasticsearch’s schema-free architecture allows developers to index and search unstructured content, making it perfectly suited for both small projects and large big data warehouses, even those with petabytes of unstructured data. This book will guide you through the world of the most commonly used Elasticsearch server functionalities. You’ll start off by getting an understanding of the basics of Elasticsearch and its data indexing functionality. Next, you will see the querying capabilities of Elasticsearch, followed by a thorough explanation of scoring and search relevance. After this, you will explore the aggregation and data analysis capabilities of Elasticsearch and will learn how cluster administration and scaling can be used to boost your application performance. You’ll find out how to use the friendly REST APIs and how to tune Elasticsearch to make the most of it. By the end of this book, you will be able to create amazing search solutions as per your project’s specifications.

Preface

Welcome to Elasticsearch Server, Third Edition. This is the third instalment of the book dedicated to yet another major release of Elasticsearch, this time version 2.2. In the third edition, we have decided to follow a route similar to the one we took when we wrote the second edition of the book. We not only updated the content to match the new version of Elasticsearch, but also restructured the book by removing some sections and chapters and adding new ones. We read the suggestions we got from you, the readers of the book, and we carefully tried to incorporate the suggestions and comments received since the release of the first and second editions.

While reading this book, you will be taken on a journey to the wonderful world of full-text search provided by the Elasticsearch server. We will start with a general introduction to Elasticsearch, which covers how to start and run Elasticsearch, its basic concepts, and how to index and search your data in the most basic way. This book will also discuss the query language, the so-called Query DSL, that allows you to create complicated queries and filter returned results. In addition to all of this, you'll see how you can use the aggregation framework to calculate aggregated data based on the results returned by your queries. We will implement the autocomplete functionality together and learn how to use the spatial capabilities of Elasticsearch and prospective search.

Finally, this book will show you Elasticsearch's administration API capabilities with features such as shard placement control, cluster handling, and more, ending with a dedicated chapter that will discuss how to prepare Elasticsearch for small and large deployments, both those that concentrate on indexing and those that concentrate on querying.

What this book covers

Chapter 1, Getting Started with Elasticsearch Cluster, covers what full-text searching is, what Apache Lucene is, what text analysis is, how to run and configure Elasticsearch, and finally, how to index and search your data in the most basic way.

Chapter 2, Indexing Your Data, shows how indexing works, how to prepare index structure, what data types we are allowed to use, how to speed up indexing, what segments are, how merging works, and what routing is.

Chapter 3, Searching Your Data, introduces the full-text search capabilities of Elasticsearch by discussing how to query it, how the querying process works, and what types of basic and compound queries are available. In addition to this, we will show how to use position-aware queries in Elasticsearch.

Chapter 4, Extending Your Query Knowledge, shows how to efficiently narrow down your search results by using filters, how highlighting works, how to sort your results, and how query rewrite works.

Chapter 5, Extending Your Index Structure, shows how to index more complex data structures. We learn how to index tree-like data types, how to index data with relationships between documents, and how to modify index structure.

Chapter 6, Make Your Search Better, covers Apache Lucene scoring and how to influence it in Elasticsearch, the scripting capabilities of Elasticsearch, and its language analysis capabilities.

Chapter 7, Aggregations for Data Analysis, introduces you to the great world of data analysis by showing you how to use the Elasticsearch aggregation framework. We will discuss all types of aggregations—metrics, buckets, and the new pipeline aggregations that have been introduced in Elasticsearch.

Chapter 8, Beyond Full-text Searching, discusses functionalities that are not related to full-text search, such as the percolator (reversed search) and the geo-spatial capabilities of Elasticsearch. This chapter also discusses suggesters, which allow us to build a spellchecking functionality and an efficient autocomplete mechanism, and we will show how to handle deep paging efficiently.

Chapter 9, Elasticsearch Cluster in Detail, discusses the node discovery mechanism, the recovery and gateway modules of Elasticsearch, templates, caches, and the settings update API.

Chapter 10, Administrating Your Cluster, covers the Elasticsearch backup functionality, rebalancing, and moving shards. In addition to this, you will learn how to use the warm up functionality, use the Cat API, and work with aliases.

Chapter 11, Scaling by Example, is dedicated to scaling and tuning. We will start with hardware preparations and considerations and with tuning related to a single Elasticsearch node. We will go through cluster setup and vertical scaling, ending the chapter with high querying and indexing use cases and cluster monitoring.

What you need for this book

This book was written using Elasticsearch server 2.2, and all the examples and functions should work with this version. In addition to this, you'll need a command-line tool that allows you to send HTTP requests, such as curl, which is available for most operating systems. Please note that all the examples in this book use curl. If you want to use another tool, please remember to format the request in a way that is understood by the tool of your choice.
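
As a rough illustration of what formatting a request for another tool can look like, here is a minimal sketch in Python using only the standard library. It builds a PUT request of the kind this book sends with curl, without sending it. The helper name build_put_request is hypothetical, and the Content-Type header is an assumption added for clarity; the curl examples in this book do not set it.

```python
import json
import urllib.request


def build_put_request(url, body):
    """Build (but do not send) a PUT request carrying a JSON body.

    Hypothetical helper for illustration; not part of any Elasticsearch client.
    """
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        # Assumption: declare the body as JSON explicitly.
        headers={"Content-Type": "application/json"},
    )


# Index-creation request equivalent to a curl -XPUT call with a JSON body.
request = build_put_request(
    "http://localhost:9200/users/?pretty",
    {"mappings": {"user": {"numeric_detection": True}}},
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it, a node must be listening on localhost:9200:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The request is only constructed here, so the sketch runs without a cluster; uncommenting the last lines sends it to a locally running node.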

In addition to this, some chapters may require additional software, such as Elasticsearch plugins; where this is the case, it is mentioned explicitly.

Who this book is for

If you are a beginner to the world of full-text search and Elasticsearch, then this book is especially for you. You will be guided through the basics of Elasticsearch and you will learn how to use some of the advanced functionalities.

If you know Elasticsearch and have worked with it, then you may find this book interesting as it provides a nice overview of all the functionalities with examples and descriptions. However, you may encounter sections that you already know.

If you know the Apache Solr search engine, this book can also be used to compare some functionalities of Apache Solr and Elasticsearch. This may give you the knowledge about which tool is more appropriate for your use case.

If you know all the details about Elasticsearch and you know how each of the configuration parameters works, then this is definitely not the book you are looking for.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "If you use the Linux or OS X command, the cURL package should already be available."

A block of code is set as follows:

{
  "mappings": {
    "post": {
      "properties": {                
        "id": { "type":"long" },
        "name": { "type":"string" },
        "published": { "type":"date" },
        "contents": { "type":"string" }             
      }
    }
  }
}

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

{
  "mappings": {
    "post": {
      "properties": {                
        "id": { "type":"long" },
        "name": { "type":"string" },
        "published": { "type":"date" },
        "contents": { "type":"string" }             
      }
    }
  }
}

Any command-line input or output is written as follows:

curl -XPUT 'http://localhost:9200/users/?pretty' -d '{
  "mappings" : {
    "user": {
      "numeric_detection" : true
    }
  }
}'

Note

Warnings or important notes appear in a box like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail us and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for this book from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

  1. Log in to or register on our website using your e-mail address and password.

  2. Hover the mouse pointer on the SUPPORT tab at the top.

  3. Click on Code Downloads & Errata.

  4. Enter the name of the book in the Search box.

  5. Select the book for which you're looking to download the code files.

  6. Choose from the drop-down menu where you purchased this book.

  7. Click on Code Download.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

  • WinRAR / 7-Zip for Windows

  • Zipeg / iZip / UnRarX for Mac

  • 7-Zip / PeaZip for Linux

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/ElasticsearchServerThirdEdition_ColorImages.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us, and we will do our best to address the problem.