Learning Hadoop 2

Programming Pig


Pig Latin ships with built-in functions (eval, load/store, math, string, bag, and tuple functions) and a set of scalar and complex data types. Pig also lets you extend its functions and data types through user-defined functions (UDFs) and dynamic invocation of Java methods.

Pig data types

Pig supports the following scalar data types:

  • int: a signed 32-bit integer

  • long: a signed 64-bit integer

  • float: a 32-bit floating-point number

  • double: a 64-bit floating-point number

  • chararray: a character array (string) in Unicode UTF-8 format

  • bytearray: a byte array (blob)

  • boolean: a Boolean value (true or false)

  • datetime: a date and time value

  • biginteger: a Java BigInteger

  • bigdecimal: a Java BigDecimal
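
The widths listed above determine which values each type can hold. The following is a minimal Java sketch of those limits, using only the standard library. The class PigScalarSketch and its two methods are hypothetical helpers written for this illustration and are not part of Pig's API; they only mirror the 32-bit and 64-bit sizes given in the list.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only; not part of Pig.
final class PigScalarSketch {

    /** Returns the name of the narrowest Pig integer type that can hold the value. */
    static String narrowestIntegerType(BigInteger value) {
        // bitLength() excludes the sign bit, so fewer than 32 bits
        // means the value fits in a signed 32-bit integer.
        if (value.bitLength() < 32) {
            return "int";
        }
        if (value.bitLength() < 64) {
            return "long";
        }
        return "biginteger";
    }

    /** True when the decimal literal is represented exactly by a 32-bit float. */
    static boolean exactAsFloat(String literal) {
        BigDecimal exact = new BigDecimal(literal);
        // new BigDecimal(double) converts the binary value exactly, so any
        // rounding introduced by floatValue() shows up in the comparison.
        return new BigDecimal(exact.floatValue()).compareTo(exact) == 0;
    }

    public static void main(String[] args) {
        BigInteger beyondLong = BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
        System.out.println(narrowestIntegerType(BigInteger.valueOf(Integer.MAX_VALUE))); // int
        System.out.println(narrowestIntegerType(BigInteger.valueOf(Long.MAX_VALUE)));    // long
        System.out.println(narrowestIntegerType(beyondLong));                            // biginteger
        System.out.println(exactAsFloat("0.5")); // true
        System.out.println(exactAsFloat("0.1")); // false
    }
}
```

Values such as 0.1 that have no exact binary representation are the usual reason to prefer bigdecimal over float or double when exact decimal arithmetic matters.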

Pig supports the following complex data types:

  • map: an associative array enclosed by [], with each key separated from its value by #, and entries separated by ,

  • tuple: an ordered list of fields enclosed by (), with items separated by ,; each element can be of any scalar or complex type

  • bag: an unordered collection of tuples enclosed by {} and...
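
The delimiters described in this list can be sketched as a few string-building helpers. The code below is an illustrative Java sketch, not Pig's own implementation: the class PigNotationSketch and its methods are hypothetical, plain Java collections stand in for Pig's map, tuple, and bag types, and the comma between the tuples of a bag is an assumption based on Pig's usual literal notation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of Pig.
final class PigNotationSketch {

    /** Renders an ordered list of fields as a tuple: (a,b,c). */
    static String tuple(List<?> fields) {
        return fields.stream()
                .map(PigNotationSketch::render)
                .collect(Collectors.joining(",", "(", ")"));
    }

    /** Renders key/value pairs as a map: [key#value,key#value]. */
    static String map(Map<?, ?> entries) {
        return entries.entrySet().stream()
                .map(e -> e.getKey() + "#" + render(e.getValue()))
                .collect(Collectors.joining(",", "[", "]"));
    }

    /** Renders a collection of tuples as a bag: {(a,b),(c,d)}. */
    static String bag(List<? extends List<?>> tuples) {
        // Assumption: tuples inside a bag are separated by commas.
        return tuples.stream()
                .map(PigNotationSketch::tuple)
                .collect(Collectors.joining(",", "{", "}"));
    }

    /** Renders nested lists as tuples, nested maps as maps, anything else as text. */
    private static String render(Object value) {
        if (value instanceof List) {
            return tuple((List<?>) value);
        }
        if (value instanceof Map) {
            return map((Map<?, ?>) value);
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        List<Object> alice = Arrays.<Object>asList("alice", 30);
        List<Object> bob = Arrays.<Object>asList("bob", 25);

        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "alice");
        person.put("age", 30);

        System.out.println(tuple(alice));                   // (alice,30)
        System.out.println(map(person));                    // [name#alice,age#30]
        System.out.println(bag(Arrays.asList(alice, bob))); // {(alice,30),(bob,25)}
    }
}
```

A LinkedHashMap is used here only to keep the printed key order predictable, and the list passed to bag() has an order only because Java lists do; a Pig bag itself is unordered.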