Mastering Hadoop

By: Karanth

Overview of this book

Do you want to broaden your Hadoop skill set and take your knowledge to the next level? Do you wish to enhance your knowledge of Hadoop to solve challenging data processing problems? Are your Hadoop jobs, Pig scripts, or Hive queries not working as fast as you intend? Are you looking to understand the benefits of upgrading Hadoop? If the answer is yes to any of these, this book is for you. It assumes novice-level familiarity with Hadoop.

Pig versus SQL


SQL is a very popular query and data processing language, so any high-level language for data processing invites comparison with it. This section compares Pig Latin with SQL on the following points:

  • Pig Latin is primarily a procedural language, whereas SQL is declarative. A SQL query states the result that is wanted and does not spell out the order in which the data transformations happen. In Pig Latin, each step of the data transformation pipeline is specified in order. It is possible to mimic this behavior in SQL with intermediate temporary tables, but creating, managing, and cleaning up these tables can be cumbersome and error-prone. Although Pig Latin scripts are specified procedurally, the statements are executed lazily: nothing runs until a result is actually required, for example by a DUMP or STORE statement.

  • Developers writing data flows in a declarative language such as SQL depend too heavily on the query optimizer to choose the right implementation for the...
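
The first point above, a pipeline written step by step yet executed lazily, can be sketched with a small Python analogy. This is not Pig itself: the function names (load, keep_long, count_by_user) and the sample visit records are hypothetical, invented for illustration. Each step is named in order, much as relations are in a Pig Latin script, and Python generators stand in for Pig's deferred execution, so no input row is read until the final result is requested.

```python
from collections import Counter
from typing import Iterable, Iterator

rows_read = 0  # counts how many input rows have actually been read


def load(rows: Iterable[dict]) -> Iterator[dict]:
    """Step 1: read the input one row at a time (hypothetical loader)."""
    global rows_read
    for row in rows:
        rows_read += 1
        yield row


def keep_long(rows: Iterable[dict], min_duration: int) -> Iterator[dict]:
    """Step 2: keep only visits longer than min_duration."""
    return (row for row in rows if row["duration"] > min_duration)


def count_by_user(rows: Iterable[dict]) -> Iterator[tuple]:
    """Step 3: count the remaining visits per user, sorted by user name."""
    counts = Counter(row["user"] for row in rows)
    yield from sorted(counts.items())


# Hypothetical sample data, invented for this sketch.
visits = [
    {"user": "ana", "duration": 45},
    {"user": "bob", "duration": 10},
    {"user": "ana", "duration": 60},
    {"user": "cho", "duration": 31},
]

# The pipeline is spelled out step by step, in order.
loaded = load(visits)
long_visits = keep_long(loaded, min_duration=30)
counts = count_by_user(long_visits)

rows_before = rows_read  # no step has run yet, so nothing has been read

# Asking for the output is what triggers the work.
result = list(counts)
rows_after = rows_read

print(rows_before, result, rows_after)
```

In a declarative style, the same computation would be a single query that filters, groups, and counts, with the order of operations left to the optimizer. Here the order is fixed by the script, while the actual work is deferred until `list(counts)` asks for the output.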