Python Machine Learning Blueprints: Intuitive data projects you can relate to

By: Alexander T. Combs

Overview of this book

Machine learning is transforming the way we understand and interact with the world around us. But how well do you really understand it? How confident are you interacting with the tools and models that drive it?

Python Machine Learning Blueprints puts your skills and knowledge to the test, guiding you through the development of machine learning applications and algorithms with real-world examples that demonstrate how to put concepts into practice.

You'll learn how to use clustering techniques to discover bargain air fares and apply linear regression to find yourself a cheap apartment, and much more. Everything you learn is backed by a real-world example, whether it's data manipulation or statistical modelling.

That way you're never left floundering in theory; you'll simply be collecting and analyzing data in a way that makes a real impact.

Building a predictive content scoring model


Let's now use what we learned to create a model that can estimate the share counts for a given piece of content. We'll use the features that we have already created, as well as a few additional features.

Ideally, we would have a much larger sample of content, especially content that had more typical share counts. Despite this, we'll make do with what we have here.

We're going to use an algorithm called random forest regression. In prior chapters, we looked at the more typical use of random forests, which is classification. Here, we'll use the regression variant and attempt to predict the share counts directly. We could bucket the share counts into ranges and treat the problem as classification, but regression is preferable when the target is a continuous variable.
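To make the trade-off concrete, here is a minimal sketch of what bucketing share counts into ranges would look like. The share values, bin edges, and labels are all hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical share counts; not taken from the chapter's dataset.
shares = pd.Series([12, 480, 2500, 91000, 730])

# Assumed bin edges and labels, purely for illustration.
bins = [0, 100, 1000, 10000, float("inf")]
labels = ["low", "medium", "high", "viral"]

# pd.cut assigns each value to the right-closed interval it falls in.
share_class = pd.cut(shares, bins=bins, labels=labels)
print(share_class.tolist())
```

Under these assumed bins, 480 and 730 shares land in the same class, so a classifier could never distinguish them. A regressor keeps that difference in the target.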

To begin, we'll create a bare-bones model that uses the number of images, the site, and the word count as features. The target we'll train it to predict is the number of Facebook likes.
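As a rough preview, a bare-bones model of this kind might look like the sketch below. It is not the chapter's actual code: the toy data, the column names (`img_count`, `word_count`, `site`, `fb`), the one-hot encoding of the site, and the train/test split are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy data standing in for the chapter's DataFrame.
df = pd.DataFrame({
    "img_count": [1, 3, 0, 5, 2, 4, 1, 6],
    "word_count": [350, 1200, 150, 2200, 800, 1500, 400, 3000],
    "site": ["a.com", "b.com", "a.com", "c.com",
             "b.com", "c.com", "a.com", "b.com"],
    "fb": [120, 950, 40, 4300, 610, 2100, 180, 5200],  # Facebook likes
})

# One-hot encode the categorical site column so the trees can split on it.
site_dummies = pd.get_dummies(df["site"], prefix="site", dtype=int)
X = pd.concat([df[["img_count", "word_count"]], site_dummies], axis=1)
y = df["fb"]

# Hold out the last two rows as a tiny test set (assumed split).
X_train, X_test = X.iloc[:6], X.iloc[6:]
y_train, y_test = y.iloc[:6], y.iloc[6:]

# Fit a random forest regressor; random_state makes the run repeatable.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Predicted like counts for the held-out rows.
preds = model.predict(X_test)
print(preds)
```

Because each tree predicts the mean of the training targets in a leaf, the forest's predictions always fall within the range of like counts seen during training. That is one reason a larger and more varied sample of content would help.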

We'll first import the scikit-learn library, then...