MySQL 8 for Big Data

By: Shabbir Challawala, Chintan Mehta, Kandarp Patel, Jaydip Lakhatariya

Overview of this book

With organizations handling large amounts of data on a regular basis, MySQL has become a popular solution for handling structured Big Data. In this book, you will see how DBAs can use MySQL 8 to handle billions of records, and to load and retrieve data with performance comparable or superior to that of more expensive commercial database solutions. Many organizations today depend on MySQL for their websites and on a Big Data solution for their data archiving, storage, and analysis needs. However, integrating the two can be challenging. This book will show you how to implement a successful Big Data strategy with Apache Hadoop and MySQL 8. It covers real-time use case scenarios that explain the integration and show how to build Big Data solutions using technologies such as Apache Hadoop, Apache Sqoop, and MySQL Applier. The book also includes case studies on Apache Sqoop and real-time event processing. By the end of this book, you will know how to use MySQL 8 efficiently to manage data for your Big Data applications.

Indexing JSON data

Big Data often includes many documents stored in the JSON format, so proper indexing of JSON data is important. A JSON column cannot be indexed directly. Instead, we can create generated columns that extract values from the JSON data and then define indexes on those generated columns.
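The approach described above can be sketched as follows. This is a minimal illustration, not an example from the book: the table `employee_docs`, the column names, the index name, and the sample rows are all hypothetical. It assumes an InnoDB table and uses the `->>` operator to extract an unquoted scalar from the JSON document.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE employee_docs (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    doc  JSON NOT NULL,
    -- Generated column that extracts one scalar value from the JSON document.
    city VARCHAR(64) GENERATED ALWAYS AS (doc->>'$.city') VIRTUAL,
    -- Ordinary secondary index defined on the generated column.
    INDEX idx_city (city)
);

INSERT INTO employee_docs (doc) VALUES
    ('{"name": "Asha", "city": "Pune"}'),
    ('{"name": "Ravi", "city": "Surat"}');

-- Filtering on the generated column lets the optimizer consider idx_city.
EXPLAIN
SELECT id, doc->>'$.name' AS name
FROM employee_docs
WHERE city = 'Pune';
```

Only the extracted `city` value is indexed here; other keys in the document would each need their own generated column and index if they are used for filtering.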

Let's first understand what generated columns are in MySQL.

Generated columns

A generated column does not hold data supplied directly by the application. Instead, its value is computed from other columns of the same table using an expression or calculation.

Generated columns can be divided into two types: virtual generated columns and stored generated columns.
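To make the two types concrete, here is a minimal sketch under assumed names: the table `order_lines` and its columns are hypothetical and do not come from the book. Both generated columns use the same expression and differ only in the `VIRTUAL` or `STORED` keyword.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE order_lines (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    quantity   INT NOT NULL,
    unit_price DECIMAL(10,2) NOT NULL,
    -- Virtual: the value is computed when the row is read and is not stored.
    line_total DECIMAL(12,2)
        GENERATED ALWAYS AS (quantity * unit_price) VIRTUAL,
    -- Stored: the value is computed on insert or update and saved with the row.
    line_total_stored DECIMAL(12,2)
        GENERATED ALWAYS AS (quantity * unit_price) STORED
);

-- Values are supplied only for the base columns; the generated ones are derived.
INSERT INTO order_lines (quantity, unit_price) VALUES (3, 19.99);

SELECT quantity, unit_price, line_total, line_total_stored
FROM order_lines;
```

Both generated columns return the same result for a given row; the choice between them is a trade-off between storage space and the cost of recomputing the expression on reads.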

Virtual generated columns

Columns defined as virtual generated columns do not actually store their values; the values are calculated on the fly when rows are read. If neither VIRTUAL nor STORED is specified in the definition of a generated column, the column is virtual by default. InnoDB supports secondary indexes on virtual generated columns. As the values of virtual columns are calculated...