Mastering Hadoop

By: Karanth

Overview of this book

Do you want to broaden your Hadoop skill set and take your knowledge to the next level? Do you wish to enhance your knowledge of Hadoop to solve challenging data processing problems? Are your Hadoop jobs, Pig scripts, or Hive queries not working as fast as you intend? Are you looking to understand the benefits of upgrading Hadoop? If the answer is yes to any of these, this book is for you. It assumes novice-level familiarity with Hadoop.

Complex data types in Pig


Pig has primitive data types such as int, long, float, double, chararray, and bytearray. Pig also supports complex data types. The inputs and outputs of Pig's relational operators are specified using these complex data types, and in some cases the behavior of an operator depends on which complex data type it is given. The complex data types are as follows:

  • Map: This data type should not be confused with the map function of MapReduce. The Map data type is an associative array that stores a chararray key and its associated value. There is no restriction on the data type of the value in a map; it can itself be a complex type. If the type of the value cannot be determined, Pig defaults to the bytearray data type. The # symbol associates a key with its value. The keys within a map have to be unique:

    [key#value, key1#value1…]
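The notation above can be illustrated with a minimal Python sketch that reads a flat map literal into a dictionary. This is not Pig's own parser: the function name `parse_pig_map` is hypothetical, nested values are not handled, and raising an error on a duplicate key is a choice made for this sketch rather than a statement about how Pig reacts. Values are kept as raw bytes to mirror the bytearray default that applies when no value type is known.

```python
from typing import Dict


def parse_pig_map(text: str) -> Dict[str, bytes]:
    """Parse a flat map literal such as '[city#Pune, zip#411001]'.

    Hypothetical helper for illustration only; it does not handle
    nested complex values or keys/values containing ',' or '#'.
    """
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("a map literal must be enclosed in [ ]")

    body = text[1:-1].strip()
    result: Dict[str, bytes] = {}
    if not body:
        return result  # '[]' is an empty map

    for entry in body.split(","):
        # '#' separates the chararray key from its value
        key, sep, value = entry.partition("#")
        if not sep:
            raise ValueError(f"missing '#' in entry {entry!r}")
        key = key.strip()
        if key in result:
            # Assumption of this sketch: reject repeated keys outright
            raise ValueError(f"duplicate key {key!r}")
        # No value type is declared, so keep raw bytes (models bytearray)
        result[key] = value.strip().encode("utf-8")
    return result


if __name__ == "__main__":
    print(parse_pig_map("[city#Pune, zip#411001]"))
    # prints {'city': b'Pune', 'zip': b'411001'}
```

Keys come back as strings and values as bytes, so a caller has to convert a value explicitly before treating it as a number or text, much as an untyped map value needs a cast before it can be used as a specific type.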

  • Tuple: The Tuple data type is a collection of data values. Tuples are of fixed length and are ordered. They can...