Elasticsearch Server - Third Edition

By: Rafal Kuc

Overview of this book

Elasticsearch is a very fast and scalable open source search engine, designed with distribution and cloud in mind, complete with all the goodies that Apache Lucene has to offer. Elasticsearch's schema-free architecture allows developers to index and search unstructured content, making it well suited to both small projects and large data warehouses, even those with petabytes of unstructured data. This book will guide you through the most commonly used Elasticsearch server functionalities. You'll start off by getting an understanding of the basics of Elasticsearch and its data indexing functionality. Next, you will see the querying capabilities of Elasticsearch, followed by a thorough explanation of scoring and search relevance. After this, you will explore the aggregation and data analysis capabilities of Elasticsearch and will learn how cluster administration and scaling can be used to boost your application performance. You'll find out how to use the friendly REST APIs and how to tune Elasticsearch to make the most of it. By the end of this book, you will be able to create amazing search solutions as per your project's specifications.

Elasticsearch Server Third Edition

Credits

About the Authors

About the Reviewer

www.PacktPub.com

Preface

Getting Started with Elasticsearch Cluster

Full text searching

The basics of Elasticsearch

Installing and configuring your cluster

Manipulating data with the REST API

Searching with the URI request query

Summary

Indexing Your Data

Elasticsearch indexing

Mappings configuration

Batch indexing to speed up your indexing process

Introduction to segment merging

Introduction to routing

Summary

Searching Your Data

Querying Elasticsearch

Understanding the querying process

Basic queries

Compound queries

Using span queries

Choosing the right query

Summary

Extending Your Querying Knowledge

Filtering your results

Highlighting

Validating your queries

Sorting data

Query rewrite

Summary

Extending Your Index Structure

Indexing tree-like structures

Indexing data that is not flat

Using nested objects

Using the parent-child relationship

Modifying your index structure with the update API

Summary

Make Your Search Better

Introduction to Apache Lucene scoring

Scripting capabilities of Elasticsearch

Searching content in different languages

Influencing scores with query boosts

When does index-time boosting make sense?

Words with the same meaning

Understanding the explain information

Summary

Aggregations for Data Analysis

Aggregations

Aggregation types

Pipeline aggregations

Summary

Beyond Full-text Searching

Percolator

Elasticsearch spatial capabilities

Using suggesters

The Scroll API

Summary

Elasticsearch Cluster in Detail

Understanding node discovery

The gateway and recovery modules

Templates and dynamic templates

Elasticsearch plugins

Elasticsearch caches

The update settings API

Summary

Administrating Your Cluster

Elasticsearch time machine

Monitoring your cluster's state and health

Controlling the shard and replica allocation

Controlling cluster rebalancing

The Cat API

Warming up

Index aliasing and using it to simplify your everyday work

Summary

Scaling by Example

Hardware

Preparing a single Elasticsearch node

Horizontal expansion

Preparing the cluster for high indexing and querying throughput

Monitoring

Summary

Index

Introduction to routing

By default, Elasticsearch will try to distribute your documents evenly among all the shards of the index. However, that's not always the desired situation. In order to retrieve the documents, Elasticsearch must query all the shards and merge the results. What if we could divide our data on some basis (for example, the client identifier) and use that information to put data with the same properties in the same place in the cluster? Elasticsearch allows us to do that by exposing a powerful document and query distribution control mechanism: routing. In short, it allows us to choose the shard to be used to index or search the data.
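To make the idea concrete, here is a minimal in-memory sketch of routing, not Elasticsearch's actual implementation. The `TinyIndex` class and the `pick_shard` function are hypothetical names invented for this illustration, and CRC32 is only a stand-in for whatever hash function the server uses. The sketch shows that when documents are indexed with the client identifier as the routing value, a search that supplies the same routing value needs to visit only one shard instead of all of them.

```python
import zlib


def pick_shard(routing_value: str, num_shards: int) -> int:
    """Map a routing value to a shard number (CRC32 is a stand-in hash)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


class TinyIndex:
    """Hypothetical in-memory model of an index split into shards."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def index(self, doc_id, doc, routing=None):
        # Without an explicit routing value, the document identifier is used.
        key = routing if routing is not None else doc_id
        shard_no = pick_shard(key, len(self.shards))
        self.shards[shard_no][doc_id] = doc
        return shard_no

    def search(self, client, routing=None):
        # With a routing value only one shard is visited; otherwise all are.
        if routing is not None:
            targets = [self.shards[pick_shard(routing, len(self.shards))]]
        else:
            targets = self.shards
        hits = [d for shard in targets for d in shard.values()
                if d["client"] == client]
        return hits, len(targets)


idx = TinyIndex(num_shards=4)
for i in range(20):
    client = "client-%d" % (i % 3)
    # Route every document by its client identifier.
    idx.index("doc-%d" % i, {"client": client, "n": i}, routing=client)

hits_all, visited_all = idx.search("client-1")
hits_routed, visited_routed = idx.search("client-1", routing="client-1")
print(len(hits_all), visited_all)        # 7 4
print(len(hits_routed), visited_routed)  # 7 1
```

Both searches return the same seven documents, but the routed one touches a single shard. The trade-off in this model is that documents indexed with a routing value must also be searched with it (or with no routing at all), since a different value would point at a different shard.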

Default indexing

During indexing operations, when you send a document for indexing, Elasticsearch looks at its identifier to choose the shard in which the document should be indexed. By default, Elasticsearch calculates the hash value of the document's identifier and, on the basis of that, it puts the document in one of the available primary shards...
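The default placement described above can be sketched as a hash of the identifier taken modulo the number of primary shards. This is an assumption-laden illustration: the text does not name the hash function, so CRC32 is used here purely as a stand-in, and `default_shard` is a hypothetical helper, not part of any Elasticsearch client.

```python
import zlib
from collections import Counter


def default_shard(doc_id: str, num_primary_shards: int) -> int:
    # CRC32 stands in for the real hash function, which the text does not name.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


num_primary_shards = 5

# Place 10,000 documents with sequential identifiers and count them per shard.
placement = Counter(
    default_shard("doc-%d" % i, num_primary_shards) for i in range(10000)
)

for shard_no in sorted(placement):
    print(shard_no, placement[shard_no])
```

Because the shard is derived only from the identifier, the same identifier always maps to the same shard, and with a reasonable hash the per-shard counts should come out roughly even.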
