MySQL 8 for Big Data

By: Shabbir Challawala, Chintan Mehta, Kandarp Patel, Jaydip Lakhatariya

Overview of this book

With organizations handling large amounts of data on a regular basis, MySQL has become a popular solution for handling structured Big Data. In this book, you will see how DBAs can use MySQL 8 to handle billions of records, and to load and retrieve data with performance comparable or superior to that of costlier commercial database solutions. Many organizations today depend on MySQL for their websites and on a Big Data solution for their data archiving, storage, and analysis needs. However, integrating the two can be challenging. This book will show you how to implement a successful Big Data strategy with Apache Hadoop and MySQL 8. It covers real-time use case scenarios that explain how to integrate these systems and build Big Data solutions using technologies such as Apache Hadoop, Apache Sqoop, and MySQL Applier. The book also includes case studies on Apache Sqoop and real-time event processing. By the end of this book, you will know how to use MySQL 8 efficiently to manage data for your Big Data applications.

Organizing and analyzing data in Hadoop


As we learned in Chapter 9, Case study: Part I - Apache Sqoop for exchanging data between MySQL and Hadoop, Hadoop can be used to process data generated through relational databases such as MySQL. In this topic, we will find out how to use Hadoop to analyze the data generated in MySQL 8. Based on our e-commerce store case study, we will try to find the bestselling product from the customers' order history. We will transfer the order data generated in MySQL 8 into Apache Hive using MySQL Applier. Then we will use the Hive Query Language (Hive-QL) to analyze the required data.

Hive-QL queries are executed as map-reduce jobs that run in parallel across the Hadoop cluster, which makes it practical to analyze millions of records quickly. Data generated in Hive can be transferred back to MySQL 8 as a flat table.
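
As a rough sketch of the kind of analysis described above, the following Python example runs a bestseller aggregation against a small in-memory SQLite table. SQLite stands in for Hive here only so that the example is runnable on its own. The column names (orderId, productId, quantity) and the sample rows are hypothetical and are not taken from the book's schema. The query itself uses only GROUP BY, SUM, ORDER BY, and LIMIT, which Hive-QL also supports, so a similar statement could plausibly be run in Hive once the order data is there.

```python
import sqlite3

# Hypothetical sample data: (orderId, productId, quantity).
# These columns and values are assumptions for illustration only;
# they do not come from the book's orderHistory definition.
ROWS = [
    (1, 101, 2),
    (2, 102, 1),
    (3, 101, 3),
    (4, 103, 4),
    (5, 102, 1),
]

# Aggregate units sold per product and keep the top one.
BESTSELLER_QUERY = """
SELECT productId, SUM(quantity) AS totalSold
FROM orderHistory
GROUP BY productId
ORDER BY totalSold DESC
LIMIT 1
"""


def find_bestseller(rows):
    """Return (productId, totalSold) for the product with the most units sold."""
    conn = sqlite3.connect(":memory:")
    try:
        # Stand-in table with the assumed columns.
        conn.execute(
            "CREATE TABLE orderHistory ("
            "orderId INTEGER, productId INTEGER, quantity INTEGER)"
        )
        conn.executemany("INSERT INTO orderHistory VALUES (?, ?, ?)", rows)
        return conn.execute(BESTSELLER_QUERY).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(find_bestseller(ROWS))
```

With the sample rows above, product 101 has the highest total (5 units). On a real cluster, Hive would distribute the same grouping and summing work across map and reduce tasks rather than running it in a single process.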

Consider the following table of user order history generated in MySQL 8:

CREATE TABLE IF NOT EXISTS `orderHistory`...