Mastering Predictive Analytics with Python

By: Joseph Babcock

Overview of this book

The volume, diversity, and speed of available data have never been greater. Powerful machine learning methods can unlock the value in this information by finding complex relationships and unanticipated trends. Using the Python programming language, analysts can apply these sophisticated methods to build scalable analytic applications that deliver insights of tremendous value to their organizations. In Mastering Predictive Analytics with Python, you will learn the process of turning raw data into powerful insights. Through case studies and code examples using popular open-source Python libraries, this book illustrates the complete development process for analytic applications and shows how to quickly apply these methods to your own data to create robust and scalable prediction services. Covering a wide range of algorithms for classification, regression, and clustering, as well as cutting-edge techniques such as deep learning, this book illustrates not only how these methods work, but also how to implement them in practice. You will learn to choose the right approach for your problem and how to develop engaging visualizations to bring the insights of predictive modeling to life.

Summary


After finishing this chapter, you should be able to describe the core components of an analytic pipeline and the ways in which they interact. We've examined the differences between batch and streaming processes, along with some of the use cases for which each type of application is well suited. We've also walked through examples using both paradigms and the design decisions needed at each step.

In the following chapters, we will develop the concepts described here and go into greater detail on some of the technical terms brought up in the case studies. In Chapter 2, Exploratory Data Analysis and Visualization in Python, we will introduce interactive data visualization and exploration using open source Python tools. Chapter 3, Finding Patterns in the Noise – Clustering and Unsupervised Learning, describes how to identify groups of related objects in a dataset using clustering methods, a form of unsupervised learning. In contrast, Chapter 4, Connecting the Dots with Models – Regression Methods, and Chapter 5, Putting Data in its Place – Classification Methods and Analysis, explore supervised learning, whether for continuous outcomes such as prices (using the regression techniques in Chapter 4) or categorical responses such as user sentiment (using the classification models in Chapter 5). Given a large number of features, or complex data such as text or images, we may benefit from performing dimensionality reduction, as described in Chapter 6, Words and Pixels – Working with Unstructured Data. Alternatively, we may fit textual or image data using more sophisticated models such as the deep neural networks covered in Chapter 7, Learning from the Bottom Up – Deep Networks and Unsupervised Features, which can capture complex interactions between input variables. In order to use these models in business applications, we will develop a web framework to deploy analytical solutions in Chapter 8, Sharing Models with Prediction Services, and describe ongoing monitoring and refinement of the system in Chapter 9, Reporting and Testing – Iterating on Analytic Systems.

Throughout, we will emphasize both how these methods work and practical tips for choosing between different approaches for various problems. Working through the code examples will illustrate the required components for building and maintaining an application for your own use case. With these preliminaries, let's dive next into some exploratory data analysis using notebooks: a powerful way to document and share analysis.