So far in the book, we have covered how to use the capabilities of ES-Hadoop to make sense of data in HDFS or from live streams. We know how to get data in and out of Elasticsearch and how to execute complex queries on it. However, we didn't need to explore the details of setting up clusters, shards, or replicas. This is how Elasticsearch was intended to be: it is easy to get started with, and its defaults make a lot of sense in almost all situations. You don't really need to go into detailed configuration unless you are deploying to a production environment.
This chapter touches on the important concepts, configurations, and guidelines for Elasticsearch and ES-Hadoop that are essential to know before designing your strategy for production. We will discuss the following topics in this chapter:
Elasticsearch in a distributed environment
The Elasticsearch-Hadoop architecture
Configuring the environment for production
Administration...