A Scalding job commonly processes files from HDFS and joins them with data fetched from a SQL database. Similarly, we often have to implement a MapReduce job that writes its results into a SQL database.
For SQL in the context of MapReduce, we want support for all access patterns and many SQL dialects, as well as batch capabilities. Batching is the technique of aggregating multiple (possibly hundreds of) SQL statements and sending them to the database system as a single batch command.
Batching is very important because a MapReduce application can easily scale to hundreds of Java virtual machines running the map and reduce tasks. Having hundreds of nodes communicating with a database system at the same time can stress it to its limits, so each task should make as few round trips to the database as possible.
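To make the idea of batching concrete, the following is a minimal sketch using plain JDBC from Scala. It is not the mechanism Scalding itself uses to talk to a database. It only illustrates how a single task might group its output rows and send each group to the database in one call. The object name `BatchInsertSketch`, the helpers `batches` and `writeInBatches`, and the table `results` with columns `word` and `total` are all hypothetical names chosen for illustration.

```scala
import java.sql.Connection

object BatchInsertSketch {
  // Hypothetical table and column names, used only for illustration.
  val InsertSql = "INSERT INTO results (word, total) VALUES (?, ?)"

  // Splits a stream of rows into groups of at most `size` elements.
  // The last group may be smaller than `size`.
  def batches[A](rows: Iterator[A], size: Int): Iterator[Seq[A]] = {
    require(size > 0, "size must be positive")
    rows.grouped(size)
  }

  // Sends the rows to the database one batch at a time and returns the
  // number of batches executed. Opening the connection and handling
  // transactions are assumed to be the caller's responsibility.
  def writeInBatches(conn: Connection,
                     rows: Iterator[(String, Long)],
                     batchSize: Int): Int = {
    val stmt = conn.prepareStatement(InsertSql)
    try {
      var executed = 0
      batches(rows, batchSize).foreach { group =>
        group.foreach { case (word, total) =>
          stmt.setString(1, word)
          stmt.setLong(2, total)
          stmt.addBatch()      // queue the statement locally
        }
        stmt.executeBatch()    // one call to the database for the whole group
        executed += 1
      }
      executed
    } finally {
      stmt.close()
    }
  }
}
```

With a batch size of 100, a task that produces 250 rows issues three `executeBatch()` calls instead of 250 individual inserts. The batch size is a tuning parameter: larger batches mean fewer round trips but more memory held per task and more work lost if a batch fails.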