Big Data Analytics

By: Venkat Ankam

Overview of this book

This book aims to provide the fundamentals of Apache Spark and Hadoop. All Spark components (Spark Core, Spark SQL, DataFrames, Datasets, conventional streaming, Structured Streaming, MLlib, and GraphX) and the Hadoop core components (HDFS, MapReduce, and YARN) are explored in depth, with implementation examples on Spark + Hadoop clusters. The industry is moving away from MapReduce to Spark, so the advantages of Spark over MapReduce are explained in depth to help readers reap the benefits of in-memory speeds. The DataFrames API, the Data Sources API, and the new Dataset API are explained for building Big Data analytical applications. Real-time data analytics using Spark Streaming with Apache Kafka and HBase is covered to help readers build streaming applications. The new Structured Streaming concept is explained with an IoT (Internet of Things) use case. Machine learning techniques are covered using MLlib, ML Pipelines, and SparkR, and graph analytics is covered with the GraphX and GraphFrames components of Spark. Readers will also get an opportunity to get started with web-based notebooks such as Jupyter and Apache Zeppelin, and with the data flow tool Apache NiFi, to analyze and visualize data.

Real-life use cases


Let's take a look at different kinds of use cases for Big Data analytics. Broadly, Big Data analytics use cases are classified into the following five categories:

  • Customer analytics: Data-driven customer insights are necessary to deepen relationships and improve revenue.

  • Operational analytics: Performance and high service quality are the keys to retaining customers in any industry, from manufacturing to health services.

  • Data-driven products and services: Businesses create new products and services that align with growing business demands.

  • Enterprise Data Warehouse (EDW) optimization: Early data adopters have warehouse architectures that are 20 years old, so businesses are modernizing their EDW architectures to handle the data deluge.

  • Domain-specific solutions: These give businesses an effective way to implement new features or adhere to industry compliance requirements.

The following table shows you typical use cases of Big Data analytics:

| Problem class                     | Use cases                                                 | Data analytics or data science? |
|-----------------------------------|-----------------------------------------------------------|---------------------------------|
| Customer analytics                | A 360-degree view of the customer                         | Data analytics and data science |
|                                   | Call center analytics                                     | Data analytics and data science |
|                                   | Sentiment analytics                                       | Data science                    |
|                                   | Recommendation engine (for example, the next best action) | Data science                    |
| Operational analytics             | Log analytics                                             | Data analytics                  |
|                                   | Call center analytics                                     | Data analytics                  |
|                                   | Unstructured data management                              | Data analytics                  |
|                                   | Document management                                       | Data analytics                  |
|                                   | Network analytics                                         | Data analytics and data science |
|                                   | Preventive maintenance                                    | Data science                    |
|                                   | Geospatial data management                                | Data analytics and data science |
|                                   | IoT analytics                                             | Data analytics and data science |
| Data-driven products and services | Metadata management                                       | Data analytics                  |
|                                   | Operational data services                                 | Data analytics                  |
|                                   | Data/Big Data environments                                | Data analytics                  |
|                                   | Data marketplaces                                         | Data analytics                  |
|                                   | Third-party data management                               | Data analytics                  |
| EDW optimization                  | Data warehouse offload                                    | Data analytics                  |
|                                   | Structured Big Data lake                                  | Data analytics                  |
|                                   | Licensing cost mitigation                                 | Data analytics                  |
|                                   | Cloud data architectures                                  | Data analytics                  |
|                                   | Software assessments and migrations                       | Data analytics                  |
| Domain-specific solutions         | Fraud and compliance                                      | Data analytics and data science |
|                                   | Industry-specific domain models                           | Data analytics                  |
|                                   | Data sourcing and integration                             | Data analytics                  |
|                                   | Metrics and reporting solutions                           | Data analytics                  |
|                                   | Turnkey warehousing solutions                             | Data analytics                  |