Hive is a framework that sits on top of core Hadoop. It acts as a data warehousing system on top of HDFS and provides easy query mechanisms to the underlying HDFS data. Programming MapReduce jobs could be tedious and will have their own development, testing, and maintenance investments. Hive queries, called Hive Query Language (HQL) are broken down into MapReduce jobs under the hood and remain a complete abstraction to the user and provide a query-based access mechanism for Hadoop data. The simplicity and "SQL" – ness of the Hive queries have made it a popular and preferred choice for users, particularly, people familiar with traditional SQL skills love it since the ramp up time is much less. The following figure gives an overview of the Hive architecture:

In effect, Hive enables you to create an interface layer over MapReduce that can be used in a similar fashion to a traditional relational database; enabling business users to use familiar tools such...