Apache Hive Essentials - Second Edition

By: Dayong Du

Overview of this book

In this book, we prepare you for your journey into big data by first introducing the background of the big data domain, along with the process of setting up and getting familiar with your Hive working environment. Next, the book guides you through discovering and transforming the value of big data with the help of examples. It also hones your skills in using the Hive language efficiently. Toward the end, the book focuses on advanced topics, such as performance, security, and extensions in Hive, which will guide you on exciting adventures on this worthwhile big data journey. By the end of the book, you will be familiar with Hive and able to work efficiently to find solutions to big data problems.
Table of Contents (12 chapters)

Data exchange with [EX|IM]PORT

When working on data migration or release deployment, we may need to move data between different environments or clusters. In HQL, the EXPORT and IMPORT statements move data between HDFS locations in different environments or clusters. The EXPORT statement exports both the data and the metadata of a table or partition. The metadata is written to a file called _metadata, and the data is written to a subdirectory called data, as follows:

> EXPORT TABLE employee TO '/tmp/output5';
No rows affected (0.19 seconds)

> dfs -ls -R /tmp/output5/;
+--------------------------------+
|           DFS Output           |
+--------------------------------+
| ... /tmp/output5/_metadata     |
| ... /tmp/output5/data          |
| ... /tmp/output5/data/000000_0 |
+--------------------------------+
3 rows selected (0.014 seconds)
For EXPORT, the database name can be used...
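
The export above covers a whole table, but the text also mentions exporting a single partition and reading the data back with IMPORT. The following is a minimal sketch of both, not output from a real session. The names employee_partitioned, employee_imported, and employee_imported_external, the partition columns year and month, and the paths /tmp/output6 and /tmp/employee_external are assumptions made for illustration; only employee and /tmp/output5 come from the example above.

```sql
-- Export one partition instead of the whole table.
-- employee_partitioned and its partition columns (year, month) are assumed names.
EXPORT TABLE employee_partitioned PARTITION (year=2018, month=12) TO '/tmp/output6';

-- Import the earlier export of employee into a new managed table.
-- employee_imported is a hypothetical target name.
IMPORT TABLE employee_imported FROM '/tmp/output5';

-- Import the same export as an external table stored at a chosen location.
-- The target name and the LOCATION path are hypothetical.
IMPORT EXTERNAL TABLE employee_imported_external
FROM '/tmp/output5'
LOCATION '/tmp/employee_external';
```

When the source and target are separate clusters, the exported directory (the _metadata file plus the data subdirectory) first has to be copied to the target cluster's HDFS, for example with hadoop distcp, before IMPORT is run against the copied path.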