Distributed search becomes an option when a search engine backed by a single index store becomes difficult to operate in terms of speed and size. Any search engine performs two major operations: the first is indexing the data, and the second is searching it.
When running search on a distributed system, either or both of these operations can run in a distributed manner, depending on why you want to run your search in a distributed environment. Let's look at the architecture for distributed search in the following diagram:
To use distributed search, the index must be split into multiple shards, and those shards should be stored across multiple nodes of a distributed system. To generate this index, a search application may use a distributed system such as Apache Hadoop, and then push the generated index to the search engine repository.
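The idea of splitting an index into shards can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Hadoop or any particular search engine does it: the function names (`shard_for`, `build_sharded_index`, `search`) are hypothetical, documents are assigned to shards by hashing their ID, and each "node" is simulated by an in-memory dictionary.

```python
import zlib
from collections import defaultdict


def shard_for(doc_id, num_shards):
    """Pick a shard with a stable hash, so the same ID always maps to the same shard."""
    # Assumption: hash partitioning on the document ID; real systems may route differently.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def build_sharded_index(docs, num_shards):
    """Build one small inverted index (term -> set of doc IDs) per shard."""
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in docs.items():
        index = shards[shard_for(doc_id, num_shards)]
        for term in text.lower().split():
            index[term].add(doc_id)
    return shards


def search(shards, term):
    """Query every shard for the term and merge the partial results."""
    hits = set()
    for index in shards:
        hits |= index.get(term.lower(), set())
    return hits
```

In a real deployment each shard would live on a separate node and the per-shard lookups would run in parallel; here the loop over `shards` stands in for that fan-out and merge step.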
Similar to the distributed index generation, even search...