Pig Design Patterns

By: Pradeep Pasupuleti

Types of data in the enterprise


This section describes the enterprise-centric view of data and its relevance to the Big Data processing stack, as depicted in the following diagram:

Data Variety in the enterprise

The following is an explanation of various categories of high-volume data:

  • Legacy data: This category covers data from all legacy systems and applications, in structured and semi-structured formats, stored online or offline. Use cases span many kinds of data, such as seismic data, hurricane data, census data, urban planning data, and socioeconomic data. These kinds of data can be ingested into Hadoop and combined with the master data to create interesting, predictive mash-ups.

  • Transactional (OLTP) data: Data from transactional systems is traditionally loaded into the data warehouse. The advent of Hadoop has addressed the lack of extreme scalability in traditional systems; thus, transactional data is often modeled so that not all the data from the source systems...