Advanced Elasticsearch 7.0

By: Wai Tak Wong

Overview of this book

Building enterprise-grade distributed applications and executing systematic search operations call for a strong understanding of Elasticsearch and expertise in using its core APIs and latest features. This book will help you master the advanced functionalities of Elasticsearch and understand how you can confidently develop a sophisticated, real-time search engine. You'll also learn to run machine learning jobs in Elasticsearch to speed up routine tasks. You'll get started by learning to use Elasticsearch features on Hadoop and Spark and to return query results faster, thereby enhancing the customer experience. You'll then get up to speed with performing analytics by building a metrics pipeline, defining queries, and using Kibana for intuitive visualizations that give decision-makers better insights. The book will later guide you, with examples, through using Logstash to collect, parse, and enrich logs before indexing them in Elasticsearch. By the end of this book, you will have comprehensive knowledge of advanced topics such as Apache Spark support, machine learning using Elasticsearch and scikit-learn, and real-time analytics, along with the expertise you need to increase business productivity, perform analytics, and get the very best out of Elasticsearch.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Section 1: Fundamentals and Core APIs

Overview of Elasticsearch 7

Preparing your environment

Running Elasticsearch

Talking to Elasticsearch

Elasticsearch architectural overview

Key concepts

API conventions

New features

Breaking changes

Migration between versions

Summary

Index APIs

Index management APIs

Index settings

Index aliases

Monitoring indices

Index persistence

Advanced index management APIs

Summary

Document APIs

The Elasticsearch document life cycle

Single document management APIs

Multi-document management APIs

Migration from a multiple mapping types index

Summary

Mapping APIs

Dynamic mapping

Meta fields in mapping

Field datatypes

Mapping parameters

Refreshing mapping changes for static mapping

Typeless APIs working with old custom index types

Summary

Anatomy of an Analyzer

An analyzer's components

Character filters

Tokenizers

Token filters

Built-in analyzers

Custom analyzers

Normalizers

Summary

Search APIs

Indexing sample documents

Search APIs

Query DSL

The multi-search API

Other search-related APIs

Summary

Section 2: Data Modeling, Aggregations Framework, Pipeline, and Data Analytics

Modeling Your Data in the Real World

The Investor Exchange Cloud

Modeling data and the approaches

Practical considerations

Summary

Aggregation Frameworks

ETF historical data preparation

Aggregation query syntax

Matrix aggregations

Metrics aggregations

Bucket aggregations

Pipeline aggregations

Post filter on aggregations

Summary

Preprocessing Documents in Ingest Pipelines

Ingest APIs

Accessing data in pipelines

Processors

Conditional execution in pipelines

Handling failures in pipelines

Summary

Using Elasticsearch for Exploratory Data Analysis

Business analytics

Operational data analytics

Sentiment analysis

Summary

Section 3: Programming with the Elasticsearch Client

Elasticsearch from Java Programming

Overview of Elasticsearch Java REST client

The Java low-level REST client

The Java high-level REST client

Spring Data Elasticsearch

Summary

Elasticsearch from Python Programming

Overview of the Elasticsearch Python client

The Python low-level Elasticsearch client

The Python high-level Elasticsearch library

Summary

Section 4: Elastic Stack

Using Kibana, Logstash, and Beats

Overview of the Elastic Stack

Running Elasticsearch in a Docker container

Running Kibana in a Docker container

Running Logstash in a Docker container

Running Beats in a Docker container

Summary

Working with Elasticsearch SQL

Overview

Getting started

Elasticsearch SQL language

Elasticsearch SQL REST API

Elasticsearch SQL JDBC

Summary

Working with Elasticsearch Analysis Plugins

What are Elasticsearch plugins?

Working with the ICU Analysis plugin

Working with the Smart Chinese Analysis plugin

Working with the IK Analysis plugin

Summary

Section 5: Advanced Features

Machine Learning with Elasticsearch

Machine learning with Elastic Stack

Machine learning using Elasticsearch and scikit-learn

Summary

Spark and Elasticsearch for Real-Time Analytics

Overview of ES-Hadoop

Apache Spark support

Real-time analytics using Elasticsearch and Apache Spark

Summary

Building Analytics RESTful Services

Building a RESTful web service with Spring Boot

Integration with the Bollinger Band

Building a Java Spark ML module for k-means anomaly detection

Testing Analytics RESTful services

Working with Kibana to visualize the analytics results

Summary

Other Books You May Enjoy

Leave a review - let other readers know what you think

Practical considerations

With the join datatype, the children of a parent can be added, re-indexed, or deleted individually. However, has_child and has_parent queries can have a significant impact on performance. If you need better query performance, use the nested datatype instead. The trade-off is that nested objects are stored inside their parent document, so whenever you have to update any of them, you need to re-index the parent together with all of its children. The nested datatype approach is also easier to manage than the join datatype approach. You must be very careful when using the join datatype because you can index children without a parent. Also, removing a parent does not automatically cascade to its children; you need to clean them up yourself. On the other hand, if you want to update a parent or child document, the join datatype approach is more convenient because you can update the values in the parent field or the child field...
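To make the trade-off more concrete, here is a minimal sketch in Python that builds the request bodies for both approaches as plain dictionaries. The data model (a fund with holdings), every field and relation name, and the helper function names are hypothetical and are not taken from the book; only the overall mapping and query shapes follow the Elasticsearch 7 nested and join datatypes. Nothing is sent to a cluster.

```python
import json

# All field, relation, and function names below are hypothetical examples.


def nested_mapping():
    """Mapping where each holding is stored inside its parent fund document."""
    return {
        "mappings": {
            "properties": {
                "fund_name": {"type": "keyword"},
                "holdings": {
                    "type": "nested",
                    "properties": {
                        "symbol": {"type": "keyword"},
                        "weight": {"type": "float"},
                    },
                },
            }
        }
    }


def join_mapping():
    """Mapping where funds and holdings are separate documents linked by a join field."""
    return {
        "mappings": {
            "properties": {
                "fund_name": {"type": "keyword"},
                "symbol": {"type": "keyword"},
                "weight": {"type": "float"},
                "relation": {"type": "join", "relations": {"fund": "holding"}},
            }
        }
    }


def nested_query(symbol):
    """Search funds by a value inside their nested holdings."""
    return {
        "query": {
            "nested": {
                "path": "holdings",
                "query": {"term": {"holdings.symbol": symbol}},
            }
        }
    }


def has_child_query(symbol):
    """Search funds through their child documents (typically the slower option)."""
    return {
        "query": {
            "has_child": {
                "type": "holding",
                "query": {"term": {"symbol": symbol}},
            }
        }
    }


def child_document(parent_id, symbol, weight):
    """Body of one child document; it must be indexed with routing set to parent_id."""
    return {
        "symbol": symbol,
        "weight": weight,
        "relation": {"name": "holding", "parent": parent_id},
    }


def orphan_cleanup_query(parent_id):
    """Body for a delete-by-query that removes the children of a deleted parent."""
    return {"query": {"parent_id": {"type": "holding", "id": parent_id}}}


if __name__ == "__main__":
    print(json.dumps(child_document("fund-1", "ACME", 0.25), indent=2))
    print(json.dumps(orphan_cleanup_query("fund-1"), indent=2))
```

In this sketch, updating a single holding under the join mapping means re-indexing only the small body returned by child_document, whereas under the nested mapping the whole fund document, including every holding, would have to be sent again. The orphan_cleanup_query helper reflects the manual cleanup that the join approach needs after a parent is removed.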
