Hive is a Hadoop-based data warehousing framework developed by Facebook. It lets users write queries in HiveQL, a SQL-like language, which Hive translates into Hadoop MapReduce jobs. This allows SQL programmers with no MapReduce experience to use the warehouse and makes it easier to integrate with business intelligence and visualization tools for ad hoc query processing.
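To make the abstraction concrete, here is a minimal conceptual sketch in plain Python of the map, shuffle, and reduce steps that a query such as SELECT dept, COUNT(*) FROM employees GROUP BY dept roughly corresponds to. The employees table, its rows, and the function names are hypothetical and used only for illustration; this is not Hive's actual execution plan, and no Hive or Hadoop API is involved.

```python
from collections import defaultdict

# Hypothetical rows of an "employees" table: (name, dept).
rows = [
    ("ann", "sales"),
    ("bob", "eng"),
    ("cho", "eng"),
    ("dev", "sales"),
    ("eve", "eng"),
]

def map_phase(records):
    """Emit a (dept, 1) pair for every input row."""
    for _name, dept in records:
        yield dept, 1

def shuffle(pairs):
    """Group the emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Sum the values for each key, which gives COUNT(*) per dept."""
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(map_phase(rows)))
print(counts)
```

With Hive, the user writes only the one-line query; the equivalent of the three functions above is generated and run on the cluster on the user's behalf.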
The following are the features of Hive:
Hive Query Language (HiveQL, also abbreviated HQL)
Support for user-defined functions (UDFs)
Metadata storage
Data indexing
Support for different storage types
Hadoop integration
Prerequisites for RHive are as follows:
Hadoop
Hive
We assume that readers have already configured Hadoop; if not, Hadoop installation is covered in Chapter 1, Getting Ready to Use R and Hadoop. Because RHive requires Hive, we will first see how Hive can be installed.