Big Data Analytics with Java

By: RAJAT MEHTA

Overview of this book

This book is an end-to-end guide to implementing analytics on big data with Java. Java is the de facto language for major big data environments, including Hadoop, and this book will teach you how to perform analytics on big data with production-friendly Java. It covers case studies such as sentiment analysis on a tweet dataset, recommendations on a MovieLens dataset, customer segmentation on an e-commerce dataset, and graph analysis on a real flights dataset. The book is divided into two parts. The first part is an introduction that helps readers get acquainted with big data environments, whereas the second part contains an in-depth discussion of the concepts in analytics on big data. It will take you from data analysis and data visualization to the core concepts and advantages of machine learning, real-life usage of regression and classification using Naive Bayes, a detailed discussion of the concepts of clustering, and a review of simple neural networks on big data using Deeplearning4j or plain Java Spark code. This book is for Java developers who want to start learning big data analytics and use it in the real world.

Summary


This chapter covered two important topics. First, we covered a popular probabilistic algorithm, Naive Bayes, explained its concepts, and showed how it uses Bayes' rule and conditional probability to make predictions about new data using a pre-trained model. We also explained why Naive Bayes is called naive: it makes the naive assumption that all its features are completely independent of each other, so the occurrence of one feature does not affect any other. Despite this assumption, it performs well, as we saw in our sample application. Second, in that sample application we learned a technique called sentiment analysis, which determines whether the opinion expressed in a piece of text is positive or negative.
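
As a rough illustration of the ideas summarized here, the following minimal sketch shows how Bayes' rule and the independence assumption might combine into a small sentiment classifier in plain Java. It is not the chapter's sample application: the class name `TinyNaiveBayes`, the whitespace tokenizer, the training sentences, and the add-one (Laplace) smoothing are all assumptions made for this example. Each label's score is the log of its prior plus the sum of the logs of the per-word conditional probabilities, which is exactly where the "naive" independence assumption enters.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, illustrative multinomial Naive Bayes for short texts.
class TinyNaiveBayes {
    // Number of training texts seen per label (used for the prior).
    private final Map<String, Integer> docCounts = new HashMap<>();
    // Word frequencies per label (used for the conditional probabilities).
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    // Total number of word tokens seen per label.
    private final Map<String, Integer> totalWords = new HashMap<>();
    // All distinct words seen during training.
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Updates the counts with one labeled training text.
    public void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts =
                wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : tokenize(text)) {
            counts.merge(word, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    // Returns log P(label) + sum of log P(word | label) for a trained label.
    // Summing per-word terms is valid only under the independence assumption.
    public double logScore(String label, String text) {
        double score = Math.log((double) docCounts.get(label) / totalDocs);
        Map<String, Integer> counts = wordCounts.get(label);
        int total = totalWords.getOrDefault(label, 0);
        for (String word : tokenize(text)) {
            int count = counts.getOrDefault(word, 0);
            // Add-one smoothing keeps unseen words from zeroing out the score.
            score += Math.log((count + 1.0) / (total + vocabulary.size()));
        }
        return score;
    }

    // Picks the label with the highest score; null if nothing was trained.
    public String predict(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            double score = logScore(label, text);
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }

    // Naive tokenizer: lowercases and splits on whitespace.
    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\s+");
    }

    public static void main(String[] args) {
        TinyNaiveBayes nb = new TinyNaiveBayes();
        // Made-up training sentences, for illustration only.
        nb.train("positive", "great movie loved it");
        nb.train("positive", "what a wonderful day");
        nb.train("negative", "terrible movie hated it");
        nb.train("negative", "what an awful day");
        System.out.println(nb.predict("loved this wonderful movie"));
        System.out.println(nb.predict("awful terrible"));
    }
}
```

Working in log space avoids numeric underflow when many small probabilities are multiplied. On real big data, a distributed implementation such as the one in Spark's machine learning library would be the more likely choice than hand-written counting code like this.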

In the next chapter, we will study another popular machine learning algorithm, the decision tree. We will show how closely it resembles a flowchart and explain it using a sample loan approval application.