In this chapter, we will cover the following topics:
Getting started with Apache Pig
Joining two datasets using Pig
Accessing Hive table data in Pig using HCatalog
Getting started with Apache HBase
Random data access using Java client APIs
Running MapReduce jobs on HBase
Using Hive to insert data into HBase tables
Getting started with Apache Mahout
Running K-means with Mahout
Importing data into HDFS from a relational database using Apache Sqoop
Exporting data from HDFS to a relational database using Apache Sqoop