In this chapter, we learned why and when to use big data technologies instead of a traditional relational database, and how batch processing, real-time processing, and stream processing differ. We also traced the history from databases and data warehouses to big data, reviewed some common big data terms, and became familiar with the Hadoop ecosystem, in particular the Hive architecture and the advantages of using Hive. In the next chapter, we will set up Hive and the tools needed to start using it from the command line.
Apache Hive Essentials