As your data grows, search takes longer, new indexes take longer to create, and the repository gets larger. The simplest way to preserve search performance while scaling your data is to keep adding hardware, that is, more processing power and more memory. However, this is not a cost-effective option, so we look instead at optimizing the running Big Data search instance. We have also seen different Solr architectures in Chapter 4, Using Big Data to Build Your Large Indexing; you can choose the most suitable one among them based on your requirements and usage patterns.
Optimizing the overall technology stack, which includes Apache Hadoop and Apache Solr, helps you maintain more data with reasonable performance. This optimization matters most when you scale your instance for Big Data with Hadoop and Solr. We are going to look at different techniques...