Pig Design Patterns

By: Pradeep Pasupuleti

Chapter 5. Data Transformation Patterns

In the last chapter, you learned about patterns for data validation and cleansing, which detect and remove incorrect or inaccurate records from the data. Once validation and cleansing are complete, the inconsistencies in the data have been identified before the data is used in the next steps of the analytics life cycle, and the inconsistent data has been replaced, modified, or deleted to make it more consistent.

In this chapter, you will learn about design patterns related to data transformation: the structured-to-hierarchical, normalization, integration, aggregation, and generalization design patterns.