Mastering Hadoop

By: Sandeep Karanth

User-defined functions


User-defined functions, or UDFs, are functions that the developer implements to extend Pig with custom processing. They can be called from almost all Pig operators. UDFs were originally written in Java; Python UDFs have been supported since Pig 0.8. In the latest version of Pig, in addition to Python and Java, UDFs can be written in Jython, JavaScript, Ruby, and Groovy.

The non-Java language bindings do not support all of Pig's interfaces. For example, they do not support the load and store interfaces. In this book, we will use Java to build UDFs and illustrate their power.

There is a public repository of Java UDFs called Piggy Bank. You can use UDFs written by others from this repository and contribute your own UDFs to the community.

Before using a UDF in Pig, it is necessary to register the JAR file in the Pig script. The registration is done using the REGISTER command...
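
As a rough illustration of registration, the following Pig Latin sketch registers a JAR and then calls a UDF from it. The JAR name myudfs.jar, the class myudfs.UPPER, and the input file names.txt are hypothetical names chosen for this example, not artifacts from the book.

```pig
-- Hypothetical JAR containing the compiled UDF classes.
REGISTER myudfs.jar;

-- Hypothetical input file with a single chararray field.
names = LOAD 'names.txt' AS (name:chararray);

-- Call the UDF by its fully qualified class name (assumed here to be myudfs.UPPER).
upper_names = FOREACH names GENERATE myudfs.UPPER(name);

DUMP upper_names;
```

Once the JAR is registered, Pig can resolve the UDF class when the script references it by name, which is why the REGISTER statement comes before the first call.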