Predictive Analytics with TensorFlow

By : Md. Rezaul Karim

Overview of this book

Predictive analytics discovers hidden patterns from structured and unstructured data for automated decision-making in business intelligence. This book will help you build, tune, and deploy predictive models with TensorFlow in three main sections. The first section covers linear algebra, statistics, and probability theory for predictive modeling. The second section covers developing predictive models via supervised (classification and regression) and unsupervised (clustering) algorithms. It then explains how to develop predictive models for NLP and covers reinforcement learning algorithms. Lastly, this section covers developing a factorization machines-based recommendation system. The third section covers deep learning architectures for advanced predictive analytics, including deep neural networks and recurrent neural networks for high-dimensional and sequence data. Finally, convolutional neural networks are used for predictive modeling for emotion recognition, image classification, and sentiment analysis.

Credits

About the Author

Acknowledgments

About the Reviewers

www.PacktPub.com

Customer Feedback

Preface

Basic Python and Linear Algebra for Predictive Analytics

A basic introduction to predictive analytics

A bit of linear algebra

Installing and getting started with Python

Getting started with Python

Vectors, matrices, and graphs

Span and linear independence

Principal component analysis

Singular value decomposition

Predictive analytics tools in Python

Summary

Statistics, Probability, and Information Theory for Predictive Modeling

Using statistics in predictive modeling

Basic probability for predictive modeling

Using information theory in predictive modeling

Summary

From Data to Decisions – Getting Started with TensorFlow

Taking decisions based on data - Titanic example

General overview of TensorFlow

Installing and configuring TensorFlow

TensorFlow computational graph

TensorFlow programming model

Data model in TensorFlow

TensorBoard

Getting started with TensorFlow – linear regression and beyond

Summary

Putting Data in Place - Supervised Learning for Predictive Analytics

Supervised learning for predictive analytics

Linear regression - revisited

From disaster to decision - Titanic example revisited

Summary

Clustering Your Data - Unsupervised Learning for Predictive Analytics

Unsupervised learning and clustering

Using K-means for predictive analytics

Predictive models for clustering audio files

Using kNN for predictive analytics

Summary

Predictive Analytics Pipelines for NLP

NLP analytics pipelines

Transformers and estimators

Using BOW for predictive analytics

TF-IDF model for predictive analytics

Using Word2vec for sentiment analysis

Summary

Using Deep Neural Networks for Predictive Analytics

Deep learning for better predictive analytics

Artificial Neural Networks

Deep Neural Networks

Multilayer perceptrons

DNN performance analysis

Fine-tuning DNN hyperparameters

Using multilayer perceptrons for predictive analytics

Deep belief networks

Using deep belief networks for predictive analytics

Summary

Using Convolutional Neural Networks for Predictive Analytics

CNNs and the drawbacks of regular DNNs

CNN architecture

Convolutional operations

Pooling layer and padding operations

Tuning CNN hyperparameters

CNN-based predictive model for sentiment analysis

CNN model for emotion recognition

CNN predictive model for image classification

Summary

Using Recurrent Neural Networks for Predictive Analytics

RNN architecture

Using BRNN for image classification

Implementing an RNN for spam prediction

Developing a predictive model for time series data

An LSTM predictive model for sentiment analysis

Summary

Recommendation Systems for Predictive Analytics

Recommendation systems

Collaborative filtering approach for movie recommendations

Factorization machines for recommendation systems

Improved factorization machines for predictive analytics

Summary

Using Reinforcement Learning for Predictive Analytics

Reinforcement learning

Reinforcement learning in predictive analytics

Notation, policy, and utility in RL

Developing a multiarmed bandit's predictive model

Developing a stock price predictive model

Summary

Index

Preface

The continued growth in data, coupled with the need to make increasingly complex decisions against that data, is creating massive hurdles that prevent organizations from deriving insights in a timely manner using traditional approaches. Machine learning is concerned with algorithms that transform raw data into information and then into actionable intelligence. This makes machine learning well suited to predictive analytics. Without machine learning, it would be nearly impossible to keep up with these massive streams of information.

Deep learning, in turn, is a branch of machine learning based on learning multiple levels of representation. A deep learning algorithm is essentially the implementation of a complex, deep neural network that learns by analyzing large amounts of data. It took just a few years to develop powerful deep learning algorithms that recognize images, process natural language, and perform a myriad of other complex tasks.

Considering these motivations and requirements, this book is dedicated to developers, data analysts, machine learning practitioners, and deep learning enthusiasts who want to build powerful, robust, and accurate predictive models from scratch with TensorFlow, in combination with other open source Python libraries.

The first section of this book covers applied math, statistics, and probability theory for predictive analytics. It then covers useful Python packages for getting started with data science in a practical manner. The second section shows how to develop large-scale predictive analytics pipelines using supervised learning algorithms (for example, classification and regression) and unsupervised learning algorithms (for example, clustering). It then demonstrates how to develop predictive models for NLP.

The second section closes by using reinforcement learning and a factorization machine-based recommendation system to develop predictive models. The third section covers practical mastery of deep learning architectures for advanced predictive analytics, including deep neural networks and recurrent neural networks for high-dimensional and sequence data. Finally, it shows how to develop convolutional neural network-based predictive models for emotion recognition, image classification, and sentiment analysis.

Happy Reading!

What this book covers

Chapter 1, Basic Python and Linear Algebra for Predictive Analytics, discusses the basic concepts in linear algebra for predictive analytics, such as vectors, matrices, tensors, linear dependence, and span. Then, we move on to a brief introduction to Principal Component Analysis (PCA) and Singular Value Decomposition (SVD). Finally, some predictive modeling tools in Python will be discussed.

Chapter 2, Statistics, Probability, and Information Theory for Predictive Modeling, covers some statistical, probabilistic, and information theory concepts before getting started on predictive analytics: random sampling, hypothesis testing, the chi-square test, correlation, expectation, variance, covariance, Bayes' rule, and so on. It then discusses the central objects of probability theory: random variables, stochastic processes, and events. Information theory, which studies the quantification, storage, and communication of information, is discussed at the end of the chapter.

Chapter 3, From Data to Decisions - Getting Started with TensorFlow, provides a detailed description of the main TensorFlow features in a real-life problem, followed by detailed discussions about TensorFlow installation and configuration. It then covers computation graphs, data, and programming models before getting started with TensorFlow. The last part of the chapter contains an example of implementing a linear regression model for predictive analytics.

Chapter 4, Putting Data in Place - Supervised Learning for Predictive Analytics, covers some TensorFlow-based supervised learning techniques from a theoretical and practical perspective. In particular, the linear regression model for regression analysis will be covered on a real dataset. It then shows how we could solve the Titanic survival problem using logistic regression, random forests, and SVMs for predictive analytics.

Chapter 5, Clustering Your Data - Unsupervised Learning for Predictive Analytics, digs deeper into predictive analytics and shows how to use it to cluster records that belong to a certain group or class in a dataset of unlabeled observations. It then provides some practical examples of unsupervised learning. In particular, clustering techniques using TensorFlow are discussed with some hands-on examples.

Chapter 6, Predictive Analytics Pipelines for NLP, shows how to use TensorFlow for text analytics with a focus on text classification from an unstructured spam prediction and movie review dataset. Based on the spam filtering dataset, it shows how to develop predictive models using a linear regression algorithm with TensorFlow. Particularly, it will use the bag-of-words (BOW) and TF-IDF algorithms for spam prediction. Later on, it will also show how to develop large-scale predictive models for predicting sentiment from the movie review dataset using the continuous bag-of-words (CBOW) and continuous skip-gram algorithms.

Chapter 7, Using Deep Neural Networks for Predictive Analytics, demonstrates how to train DNNs and analyze the performance metrics that are needed to evaluate a DNN predictive model. It also shows how to tune DNN hyperparameters for better, optimized performance. It then provides two examples of how to build robust and accurate predictive models, using Deep Belief Networks (DBN) and Multilayer Perceptron (MLP) on a bank marketing dataset.

Chapter 8, Using Convolutional Neural Networks for Predictive Analytics, discusses how to develop predictive analytics applications such as emotion recognition, image classification, and text classification using the convolutional neural network algorithm on real image/text datasets. Finally, it will provide some pointers on how to tune and debug CNN-based networks for optimized performance.

Chapter 9, Using Recurrent Neural Networks for Predictive Analytics, provides some theoretical background for RNNs. Then, it shows a few examples of implementing predictive models for image classification, sentiment analysis of movies and products, and spam prediction for NLP. Finally, it shows how to develop predictive models for time-series data.

Chapter 10, Recommendation Systems for Predictive Analytics, provides several examples of how to develop recommendation systems for predictive analytics, along with some theoretical background on recommendation systems, for example, matrix factorization. Later in the chapter, an example of developing a movie recommendation engine using SVD and K-means is shown. Finally, the chapter shows how we could use factorization machines to develop a more accurate and robust recommendation system.

Chapter 11, Using Reinforcement Learning for Predictive Analytics, talks about designing machine learning systems driven by criticism and rewards. It will show several examples of how to apply reinforcement learning algorithms for developing predictive models on real-life datasets.

What you need for this book

All the examples have been implemented in Python 2 and 3 with TensorFlow 1.2.0+. You will also need some additional software and tools. To be more specific, the following tools and libraries are required, preferably the latest version:

Python (2.7.x or 3.3+)
TensorFlow (1.0.0+)
Bazel (latest version)
pip/pip3 (latest version for Python 2 and 3 respectively)
matplotlib (latest version)
pandas (latest version)
NumPy (latest version)
SciPy (latest version)
sklearn (latest version)
yahoo_finance (latest version)
CUDA (latest version)
CuDNN (latest version)
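
As a quick way to see which of the Python packages listed above are already installed, a minimal sketch along the following lines may help. It is not taken from the book: the helper name installed_versions is hypothetical, the script assumes Python 3.8 or later (for importlib.metadata), and it looks packages up by their distribution names, so sklearn appears as scikit-learn. Non-Python tools such as Bazel, CUDA, and CuDNN are outside its scope.

```python
# Illustrative sketch only: report installed versions of some required packages.
# Assumes Python 3.8+; installed_versions is a hypothetical helper name.
from importlib import metadata


def installed_versions(packages):
    """Return a dict mapping each distribution name to its version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as published on PyPI (sklearn is distributed as scikit-learn).
    required = ["tensorflow", "matplotlib", "pandas", "numpy", "scipy", "scikit-learn"]
    for name, version in installed_versions(required).items():
        print(f"{name}: {version or 'not installed'}")
```

Any package reported as not installed can then be installed with pip or pip3 before running the examples.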

Linux distributions are preferable (including Debian, Ubuntu, Fedora, RHEL, and CentOS). For Ubuntu specifically, a complete installation of 14.04 (LTS) 64-bit or later is recommended, either natively or under VMware Player 12 or VirtualBox. You can also run TensorFlow jobs on Windows (XP/7/8/10) or Mac OS X (10.4.7+).

A Core i5 or Core i7 processor with GPU support is recommended for the best results. Multicore processing provides faster data processing and better scalability for predictive analytics jobs. At least 8 GB of RAM is recommended for standalone mode, at least 32 GB for a single VM, and more for a cluster. You will also need enough storage to run heavy jobs (depending on the size of the datasets you handle), preferably at least 50 GB of free disk space.
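
The storage and multicore recommendations above can be checked roughly from Python itself. The following is only a sketch using the standard library: the function names are hypothetical, 1 GB is taken as 10^9 bytes, and the 50 GB default simply mirrors the figure recommended above. It does not check RAM or GPU support.

```python
# Illustrative sketch only: compare free disk space with the 50 GB recommendation.
# Function names are hypothetical; 1 GB is assumed to mean 10**9 bytes.
import os
import shutil

GB = 10 ** 9


def free_disk_gb(path="."):
    """Return the free space, in GB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / GB


def meets_storage_recommendation(path=".", required_gb=50):
    """Return True if at least required_gb of free space is available at path."""
    return free_disk_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"CPU cores visible to Python: {os.cpu_count()}")
    print(f"Free disk space: {free_disk_gb():.1f} GB")
    print(f"Meets 50 GB recommendation: {meets_storage_recommendation()}")
```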

Who this book is for

This book is dedicated to developers, data analysts, and deep learning enthusiasts who want to build powerful, robust, and accurate predictive models with the power of TensorFlow from scratch and in combination with other open source Python libraries. If you want to build your own extensive applications that work and can predict smart decisions in the future, then this book is what you need! A good command of object-oriented programming with Python is a prerequisite. Some competence in applied mathematics, statistics, linear algebra, and information theory is a plus and would help readers understand the concepts presented in this book.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

A block of code for importing the necessary packages and library modules is set as follows:

#Import libraries (Numpy, Tensorflow, matplotlib)
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plot

To create a TensorFlow session and run some computation, we use the following code segment:

# logs_path is assumed to be defined earlier as the directory for TensorBoard event files
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    writer = tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())
    print("done")

Any command-line input or output is written as follows:

# cp /usr/src/asterisk-addons/configs/cdr_mysql.conf.sample \
     /etc/asterisk/cdr_mysql.conf

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes, for example, appear in the text like this: "Clicking the Next button moves you to the next screen."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

You can download the code files by following these steps:

Log in or register to our website using your e-mail address and password.
Hover the mouse pointer on the SUPPORT tab at the top.
Click on Code Downloads & Errata.
Enter the name of the book in the Search box.
Select the book for which you're looking to download the code files.
Choose from the drop-down menu where you purchased this book.
Click on Code Download.

You can also download the code files by clicking on the Code Files button on the book's webpage at the Packt Publishing website. This page can be accessed by entering the book's name in the Search box. Please note that you need to be logged in to your Packt account.

Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:

WinRAR / 7-Zip for Windows
Zipeg / iZip / UnRarX for Mac
7-Zip / PeaZip for Linux

The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Predictive-Analytics-with-TensorFlow. We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from https://www.packtpub.com/sites/default/files/downloads/PredictiveAnalyticswithTensorFlow_ColorImages.pdf

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyright material on the internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
