Elasticsearch Essentials

By: Bharvi Dixit

Overview of this book

With datasets constantly evolving and growing, organizations need to find actionable insights for their business. Elasticsearch, a highly advanced search and analytics engine, makes massive amounts of data usable in a matter of milliseconds. It not only gives you the power to build blazing-fast search solutions over massive amounts of data, but can also serve as a NoSQL data store. This guide will help you quickly become a competent developer with a solid understanding of the core Elasticsearch concepts. Starting from the beginning, this book covers these core concepts, setting up Elasticsearch and various plugins, working with analyzers, and creating mappings. It also covers working with Elasticsearch using Python, performing CRUD operations and aggregation-based analytics, handling document relationships in the NoSQL world, working with geospatial data, and taking data backups. Finally, we'll show you how to set up and scale Elasticsearch clusters in production environments and share some best practices.

Creating a search database


It's always good to have some practical examples with real data sets, and what could be better than real-time social media data? In this section, we will write code that fetches tweets from Twitter in real time based on the search keywords provided. The code in this section has three dependencies:

  • tweepy is a Python client for Twitter.

  • elasticsearch is a Python client for Elasticsearch that we have already installed.

  • Twitter API access token keys. Follow the instructions at https://dev.twitter.com/oauth/overview/application-owner-access-tokens to create a sample Twitter application and get the four keys needed to interact with the Twitter API. These four keys are named Access Token, Access Token Secret, Consumer Key, and Consumer Secret.

After generating the auth tokens and keys, store them inside config.py with the variable names consumer_key, consumer_secret, access_token, and access_token_secret. The...
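
As a rough illustration of the config.py file described above, the following is a minimal sketch. The four variable names come from the text; the placeholder values, the REQUIRED_NAMES tuple, and the missing_credentials helper are assumptions added here for illustration and are not part of the book's code.

```python
# config.py: a minimal sketch. The values below are placeholders (an
# assumption for illustration); replace them with the keys generated for
# your own Twitter application.
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"

# Hypothetical helper data: the four names the text asks for.
REQUIRED_NAMES = (
    "consumer_key",
    "consumer_secret",
    "access_token",
    "access_token_secret",
)


def missing_credentials(settings):
    """Return the required names that are absent, empty, or still placeholders.

    `settings` is any mapping of name to value, for example vars(config)
    after `import config`.
    """
    missing = []
    for name in REQUIRED_NAMES:
        value = settings.get(name, "")
        if not isinstance(value, str) or not value or value.startswith("YOUR_"):
            missing.append(name)
    return missing
```

A script that imports this module could call missing_credentials(vars(config)) before connecting to Twitter and stop early if the returned list is not empty. Keeping real keys out of version control is advisable whichever layout you choose.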