Scaling Big Data with Hadoop and Solr, Second Edition

By: Hrishikesh Vijay Karambelkar

Big data search using Katta


Katta provides highly scalable, fault-tolerant information storage. It is an open source project that uses the underlying Hadoop infrastructure (specifically, HDFS) to store its indices and provide access to them. Katta has been available for several years, and although its development has recently stalled, many users still rely on Solr-Katta integration for big data search. Some organizations customize Katta to suit their needs and use it for highly scalable search. Katta brings Apache Hadoop and Solr together, enabling search across a completely distributed MapReduce-based cluster. You can read more about Katta at http://katta.sourceforge.net/.

How Katta works

Katta serves two primary functions: the first is generating the Solr index, and the second is running searches on the Hadoop cluster. The following diagram depicts the Katta architecture:

The...