Book Image

MySQL 8 for Big Data

By : Shabbir Challawala, Chintan Mehta, Kandarp Patel, Jaydip Lakhatariya
Book Image

MySQL 8 for Big Data

By: Shabbir Challawala, Chintan Mehta, Kandarp Patel, Jaydip Lakhatariya

Overview of this book

With organizations handling large amounts of data on a regular basis, MySQL has become a popular solution to handle this structured Big Data. In this book, you will see how DBAs can use MySQL 8 to handle billions of records, and load and retrieve data with performance comparable or superior to commercial DB solutions with higher costs. Many organizations today depend on MySQL for their websites and a Big Data solution for their data archiving, storage, and analysis needs. However, integrating them can be challenging. This book will show you how to implement a successful Big Data strategy with Apache Hadoop and MySQL 8. It will cover real-time use case scenario to explain integration and achieve Big Data solutions using technologies such as Apache Hadoop, Apache Sqoop, and MySQL Applier. Also, the book includes case studies on Apache Sqoop and real-time event processing. By the end of this book, you will know how to efficiently use MySQL 8 to manage data for your Big Data applications.
Table of Contents (17 chapters)
Title Page
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Customer Feedback
Preface

New features in MySQL 8


The MySQL database development team has recently announced its major release as MySQL 8 Development Milestone Release (DMR) with significant updates and fixes for problems that were much needed in this change of Big Data.

You might be wondering why it's 8 after 5.7! Did the intermediate versions, that is, 6 and 7, missed out? Of course this was not the case; actually 6.0 was preserved as part of the changeover to a more frequent and timely release while 7.0 for the clustering version of MySQL.

Let's see some exciting features introduced in this latest version, as depicted in the following diagram:

It's time to look at MySQL 8 features in detail, which makes us excited and convinced about the reasons for a major version upgrade of MySQL.

Transactional data dictionary

Up until the previous version, the MySQL data dictionary was stored in different metadata files and non-transactional tables, but from this version, it will have a transactional data dictionary to store the information about the database. No more .frm, .trg, or .par files. All information will be stored in the database, which removes the cost of performing heavy file operations. There were numerous issues with filesystem metadata storage like the vulnerability of the filesystem, exorbitant file operations, difficult to handle crash recovery failures, or replication; it was also difficult to add new feature-related metadata. Now this upgrade has made it simple by storing information in a centralized manner, and will have improved performance as this data dictionary object can be cached in memory, similar to other database objects.

This data dictionary will have data that is needed for SQL query execution such as catalog information, character sets, collations, column types, indexes, database information, tables, store procedures, functions and triggers, and so on.

Roles

In MySQL 8, the privileges module has been improved by introducing roles, which means a collection of permissions. Now we can create roles with a number of privileges and assign to multiple users.

The problem with the previous version was that we were not able to define generic permissions for a group of users and each user has individual privileges. Suppose if there are 1,000 users already existing that have common privileges, and you want to remove the write permissions for all of these 1,000 users, what would you have done in the previous version? You would take the lengthy approach of updating each and every user, right? Arrgh... that's a time-consuming task.

Now with MySQL 8, it is easy to update any change in privileges. Roles will define all the required privileges and this role will be assigned to those 1,000 users. We just need to make any privilege changes in the role and all users will automatically inherit the respective privileges.

Roles can be created, deleted, grant or revoked permission, granted or revoked from the user account, and can specify the default role within the current session.

InnoDB auto increment

MySQL 8 has changed the auto-increment counter value store mechanism. Previously, it was stored in the memory, which was quite difficult to manage during server restarts or server crashes. However, now the auto-increment counter value is written into the redo log whenever the value gets changed and, on each checkpoint, it will be saved in the system table, which makes it persistent across the server restart.

With the previous version, updating the auto-increment value may have caused duplicate entry errors. Suppose if you updated the value of auto-increment in the middle of the sequence with a larger than the current maximum value, then but subsequent insert operations could not identify the unused values, which could cause a duplicate entry issue. This has been prevented by persisting the auto-increment value, hence subsequent insert operations can get the new value and allocate it properly.

If server restart happened, the auto-increment value was lost with the previous version as it was stored in memory and InnoDB needed to execute a query to find out the maximum used value. This has been changed, as the newer version has the capability to persist its value across the server restart. During the server restart, InnoDB initializes the counter value in memory using the maximum value stored in the data dictionary table. In case of server crashes, InnoDB initializes the auto-increment counter value that is bigger than the data dictionary table and the redo log.

Supporting invisible indexes

MySQL 8 provides you with a feature to make indexes invisible. These kind of indexes cannot be used by the optimizer. In case you want to test the query performance without indexes, using this feature you can do so by making them invisible rather than dropping and re-adding an index. This is a pretty handy feature when indexing is supposed to be dropped and recreated on huge datasets.

All indexes are visible by default. To make them invisible or visible, INVISIBLE and VISIBLE keywords are used respectively, as described in the following code snippet:

ALTER TABLE table1 ALTER INDEX ix_table1_col1 INVISIBLE;
ALTER TABLE table1 ALTER INDEX ix_table1_col1 VISIBLE;

Improving descending indexes

Descending indexes existed in version 5.7 too, but they were scanned in reverse order, which caused performance barriers. To improve performance, MySQL 8 has optimized this and scanned descending indexes in forward order, which has drastically improved performance. It also brings multiple column indexes for the optimizer when the most efficient scan order has ascending order for some columns, and descending order for other columns.

SET PERSIST

Server variables can be configured globally and dynamically while the server is running. There are numerous system variables that we can set using SET GLOBAL:

SET GLOBAL max_connections = 1000;

However, such settings will be lost after server restart. To avoid this, MySQL 8 has introduced the SET PERSIST variant, which preserves variables across a server restart:

SET PERSIST max_connections = 1000;

Expanded GIS support

Until the previous version, it supported only one coordinate system, a unitless 2D place that was not referenced to a position on earth. Now MySQL 8 has added support for a Spatial Reference System (SRS) with geo-referenced ellipsoids and 2D projections. SRS helps assign coordinates to a location and establishes relationships between sets of such coordinates. This spatial data can be managed in data dictionary storage as the ST_SPATIAL_REFERENCE_SYSTEMS table.

The default character set

The default character set has been changed from latin1 to UTF8. UTF8 is the dominating character set, though it hadn't been a default one in prior versions of MySQL. Along with the character set default, collation has been changed from latin1_swedish_ci to utf8mb4_800_ci_ai. With these changes globally accepted, character sets and collations are now based on UTF8; one of the common reasons is because there are around 21 different languages supported by UTF8, which makes systems provide multilingual support.

Extended bit-wise operations

In MySQL 5.7, bit-wise operations and functions were working for BigInt (64-bit integer) data types only. We needed to pass BIGINT as an argument and it would return the result as BIGINT. In short, it had maximum range up to 64 bits to perform operations. A user needs to do conversion to the BIGINT data type in case they want to perform it on other data types. This typecasting was not feasible for data types larger than 64 bits as it would truncate the actual value, which resulted in inaccuracy of data.

MySQL 8 has improved bit-wise operations by enabling support for other binary data types such as Binary, VarBinary, and BLOB. This makes it possible to perform bit-wise operations on larger than 64-bit data. No more typecasting needed! This allows the taking of arguments and returning results larger than 64 bits.

InnoDB Memcached

Multiple get operations are now possible with the InnoDB Memcached plugin, which will really help in improving the read performance. Now, multiple key value pairs can be fetched in a single Memcached query. Frequent communication traffic has also been minimized as we can get multiple data in a single shot. We will be referring the Memcached plugin in detail in Chapter 4, Using Memcached with MySQL 8.

Range queries are also supported by the InnoDB Memcached plugin. It simplifies range searches by specifying a particular range and retrieves values within this range.

NOWAIT and SKIP LOCKED

When rows are locked by other transactions that you are trying to access, then you need to wait for that transaction to release the lock on the same row so that you can access it accordingly. To avoid waiting for the other transaction, InnoDB has added support of the NOWAIT and SKIP LOCKED options. NOWAIT will return immediately with an error in case the requested row is locked rather than going into the waiting mode, and SKIP LOCKED will skip the locked row and never wait to acquire the row lock. Hence, SKIP LOCKED will not consider the locked row in the resulting set:

SELECT * FROM table1 WHERE id = 5 FOR UPDATE NOWAIT;
SELECT * FROM table1 FOR UPDATE SKIP LOCKED;