MySQL 8 for Big Data

By: Shabbir Challawala, Chintan Mehta, Kandarp Patel, Jaydip Lakhatariya

Overview of this book

With organizations handling large amounts of data on a regular basis, MySQL has become a popular solution for handling this structured Big Data. In this book, you will see how DBAs can use MySQL 8 to handle billions of records, and load and retrieve data with performance comparable or superior to that of costlier commercial DB solutions. Many organizations today depend on MySQL for their websites and on a Big Data solution for their data archiving, storage, and analysis needs. However, integrating them can be challenging. This book will show you how to implement a successful Big Data strategy with Apache Hadoop and MySQL 8. It covers real-time use case scenarios to explain integration and to deliver Big Data solutions using technologies such as Apache Hadoop, Apache Sqoop, and MySQL Applier. The book also includes case studies on Apache Sqoop and real-time event processing. By the end of this book, you will know how to efficiently use MySQL 8 to manage data for your Big Data applications.

The importance of Big Data

The importance of Big Data doesn't depend only on how much data you have; what matters more is what you are going to do with it. Data can be sourced from unpredictable places, analyzed, and used to address many kinds of problems. Let's look at some well-known, real-life use cases where Big Data has made a difference.

The following image helps us understand how a Big Data solution serves various industries. Though it's not an exhaustive list of the industries where Big Data plays a prominent role in business decisions, let's discuss a few of them:

Social media

Social media content is information, and so are engagements such as views, likes, demographics, shares, follows, unique visitors, comments, and downloads. Social media and Big Data are therefore closely interrelated. At the end of the day, what matters is how your social media-related efforts contribute to the business.


I came across one wonderful title: There's No Such Thing as Social Media ROI - It's Called Business ROI.

One notable example of the possibilities of Big Data is Facebook, which can provide insights into consumers' lifestyles, search patterns, likes, demographics, purchasing habits, and so on. Facebook stores around 100 PB of data and adds around 500 TB almost daily. Considering the number of subscribers and the data collected, this is expected to grow to more than 60 zettabytes in the next three years. The more data you have, the more precisely you can analyze it for a better Return on Investment (ROI). Information fetched from social media is also leveraged to target audiences with attractive and profitable ads.

Facebook has a service called Graph Search, which can help you do advanced searches with multiple criteria. For instance, you can search for men living in Ahmedabad who work with KNOWARTH Technologies. Google also helps you refine the search. The filters are not limited to these; they might also include school, political views, age, and name. In the same way, you can also search for hotels, photos, songs, and more. This is where the business ROI for Facebook comes in: it offers ad services that can be targeted by specific criteria such as region, interests, or other features of user data. Google also provides a similar platform called Google AdWords.


Politics

Big Data has been playing a significant role in politics too; political parties have been using various sources of data to target voters and improve their election campaigns. Big Data analytics made a significant contribution to the 2012 re-election of Barack Obama by enhancing engagement and helping the campaign speak about precisely the things that mattered to voters.

Narendra Modi is considered one of the most technology and social media-savvy politicians in the world! He has almost 500 million views on Google+, 30 million followers on Twitter, and 35 million likes on Facebook! Narendra Modi belongs to the Bharatiya Janata Party (BJP); Big Data analysis played a major part in the success of the BJP and its associates in the 2014 Indian General Election, using open source tools that helped them get in direct touch with their voters. The BJP reached its undecided voters and negative voters too, as it kept monitoring social media conversations and sent messages and adjusted its tactics accordingly to sharpen its election campaign.

Narendra Modi made a statement about prioritizing toilets before temples seven months earlier, after which the digital team closely monitored social media conversations around it. It was noticed that at least 50% of users agreed with the statement. This opportunity to win the hearts of voters was then turned into the Swachh Bharat mission, which means clean India. The results were astonishing; support for the BJP rose to around 30% in merely 50 hours.

Science and research

Did you know that, with the help of Big Data, decoding the human genome, which originally took 10 years to process, can now be done in hardly a day, with a reduction in cost almost 100 times greater than that predicted by Moore's Law? Back in the year 2000, when the Sloan Digital Sky Survey (SDSS) started gathering astronomical data, it did so at a rate of around 200 GB per night, which, at that time, was far more than had ever been collected in the history of astronomy.

The National Aeronautics and Space Administration (NASA) uses Big Data extensively, given the huge amount of science and research it carries out. NASA gathers data from across the solar system to reveal unknown information about the universe; its massive collection of data is a prominent asset for science and research, and has benefited humankind in diverse ways. The scale at which NASA fetches data, stores it, and puts it to effective use is enormous. There are so many NASA use cases that it would be difficult to elaborate on them here!

Power and energy

One of the leading energy management companies uses Big Data predictive analysis to help improve energy consumption, which in turn helps it build stronger relationships with customers and retain them. This company connects with more than 150 utilities and serves more than 35 million household customers to improve energy usage and reduce costs and carbon emissions. It also provides utility providers with analytical reports, drawn from more than 10 million data points each day, for a holistic overview of usage. Household customers get these reports with their invoices; the reports point out areas where energy usage can be reduced, which directly helps consumers optimize their energy costs.

Fraud detection

When it comes to security, fraud detection, or compliance, Big Data is your soulmate, and when that soulmate helps you identify and prevent issues before they strike, it becomes a sweet spot for business. Most of the time, fraud is detected long after it has happened, when the damage may already have been done. The next steps would obviously be to minimize the impact and improve the areas that could help prevent it from happening again.

Many companies that handle any type of transaction processing or claims use fraud detection techniques extensively. Big Data platforms help them analyze transactions, claims, and so on in real time, along with trends and anomalous behavior, to prevent fraudulent activities.

The National Security Agency (NSA) also uses Big Data analytics to foil terrorist plans. With the help of advanced fraud detection techniques, many security agencies use Big Data tools to predict criminal activity, detect credit card fraud, catch criminals, and prevent cyber attacks, among other things. As the patterns of security threats, compliance issues, and fraud change day by day, security agencies and fraud detection techniques are becoming richer in order to stay a step ahead of such unwanted scenarios.


Healthcare

Nowadays, a wrist-based health tracker is a very common thing; however, with the help of Big Data, it not only shows your personal dashboard or changes over time, but also gives you relevant suggestions for improving your diet, based on the medical data it collects, along with analytic facts about people like you. So, even a simple wrist-based health tracker can pick up many signals that help improve a patient's healthcare. Companies providing these kinds of services also analyze trends to understand how health is affected. Gradually, such wearables are also being used in Critical Care Units to quickly analyze trends and support doctors' immediate remediations.


By leveraging data accumulated from government agencies, social services files, accident reports, and clinical data, hospitals can help evaluate healthcare needs. Geographical statistics based on numerous factors, from population growth and disease rates to improvements in the quality of human life, are compared to determine the availability of medical services, ambulances, emergency services, pandemic plans, and other relevant health services. This can uncover probable environmental hazards, health risks, and trends; a few agencies already carry out this kind of analysis on a regular basis to forecast flu epidemics.

Business mapping

Netflix has millions of subscribers; it uses Big Data and analytics about subscribers' habits, based on age, gender, and geographical location, to customize its service, which has proven to generate more business, in line with its expectations.

Amazon, back in 2011, started awarding $5 to customers who used the Amazon Price Check Mobile App to scan products in a store, grab a picture, and search for the lowest prices. The app also had a feature for submitting the in-store price of a product. It was then Big Data's role to gather all this product information so that it could be compared with Amazon's own products for price comparison and customer trends, and to plan marketing campaigns and offers based on the valuable data collected, in order to dominate a rapidly developing and competitive e-commerce market.

McDonald's has more than 35,000 local restaurants that cater to around 75 million customers in more than 120 countries. It uses Big Data to gain insights that improve customer experience, looking at key factors such as menu, queue timings, order size, and customers' ordering patterns. This helps it optimize the effectiveness of its operations and customize its offering by geographical location for a more lucrative business.

There are many real-world Big Data use cases that have changed humanity, technology, predictions, health, science and research, law and order, sports, customer experience, power and energy, financial trading, robotics, and many more fields. Big Data is an integral part of our daily routine; this is not evident all the time, but it plays a significant role behind the scenes in much of what we do. It's time to start looking in detail at how the life cycle of Big Data is structured, which will give us an inside view of the many areas that play a significant role in getting data to a place where it can be processed.