Let's add a twist to the rainfall use case we solved in the previous chapter. Instead of receiving CSV files of rainfall data, we need to import the data from a MySQL database before moving on to processing.
As the first step of the analysis, we need to bring the data into Hadoop using Sqoop. To do this, we will run a Sqoop import at the end of each day to load the data onto Hadoop, and then run our Pig script to process it and save the results to Hive.
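Before wiring this into a workflow, it helps to see what such a daily import looks like on the command line. The following is a minimal sketch using Sqoop's standard `import` flags; the connection string, credentials, table name, and target directory are placeholders for illustration, not values from our use case:

```shell
# Sketch of a daily Sqoop import from MySQL into HDFS.
# Host, database, table, and credentials below are hypothetical placeholders.
sqoop import \
  --connect jdbc:mysql://dbhost:3306/weatherdb \
  --username dbuser \
  --password-file /user/hadoop/.db_password \
  --table rainfall \
  --target-dir /data/rainfall/incoming \
  -m 1
```

With `--target-dir` pointing at a date-partitioned path, each nightly run can land its data in a fresh directory for the Pig script to pick up. We will later replace this manual invocation with a Sqoop action scheduled by a Coordinator.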
As in the previous chapters, we will start with the command-line option to trigger jobs, then learn about the Sqoop action and how to schedule it via a Coordinator. Lastly, we will cover the concept of HCatalog Datasets. Let's get started.