Test Driven Machine Learning

Building the foundations of our model

Let's start by pulling the generated data into Python and transforming it into a form that we can use. To do this, we will need two additional libraries: pandas to read our generated CSV, and statsmodels to run our statistical procedures. Both libraries are powerful and full of features, and we will only touch on a few of them, so feel free to explore them further later.
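As a rough illustration of how the two libraries fit together, here is a minimal sketch, not the test we are about to write. The column names `dependent_var` and `ind_var_a` and the five rows of data are hypothetical stand-ins for the generated CSV, which is read here from an in-memory string so the snippet runs on its own:

```python
import io

import pandas
import statsmodels.formula.api as sm

# Hypothetical stand-in for the generated CSV. The column names and values
# are illustrative assumptions, not the actual generated data.
csv_text = io.StringIO(
    "dependent_var,ind_var_a\n"
    "3.1,1.0\n"
    "4.9,2.0\n"
    "7.2,3.0\n"
    "8.8,4.0\n"
    "11.1,5.0\n"
)

# pandas parses the CSV into a DataFrame with one column per header.
df = pandas.read_csv(csv_text)

# The statsmodels formula API takes an R-style formula that names
# DataFrame columns: response on the left, predictors on the right.
model_fit = sm.ols("dependent_var ~ ind_var_a", data=df).fit()

# The fitted result exposes the estimated coefficients and fit statistics.
print(model_fit.params)
print(model_fit.rsquared)
```

Calling `model_fit.summary()` would print the full regression table, which is handy while exploring what is worth asserting on.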

To start off, let's write a test that runs a simple regression over one of the variables and shows us the output. That should give us a good place to start. I'm keeping this in a unit testing structure because I know I want to test this code; for now, I just want to explore a bit to learn exactly what to test for. You could do this first step in a one-off file, but I'm choosing to start with a test so I can build from it:

import pandas
import statsmodels.formula.api as sm
import nose.tools as nt

def vanilla_model_test():
  df = pandas.read_csv('./generated_data.csv')
  model_fit...