Test Driven Machine Learning

Generating logistic data


A critical aspect of test-driving our process is being in control. In the last chapter, we fitted a model to a pregenerated set of test data and tried to guess what the beta coefficients were. In this chapter, we'll start by generating a very simple dataset from coefficients that we choose ourselves, and then we'll compute estimates of those coefficients. Because we know the true values, we can see how this all comes together and be sure that we're driving our code in the right direction.

Here is how we can generate some simple data:

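import pandas
import statsmodels.formula.api as smf
import numpy as np

# True parameters that we choose up front
observation_count = 1000
intercept = -1.6
beta1 = 0.03
# Predictor values drawn uniformly between 0 and 100
x = np.random.uniform(0, 100, size=observation_count)
# Inverse logit: converts the linear predictor into a probability between 0 and 1
x_prime = [np.exp(intercept + beta1 * x_i) / (1 + np.exp(intercept + beta1 * x_i)) for x_i in x]
# One 0/1 draw per observation, using that observation's probability
y = [np.random.binomial(1, x_prime_i, size=1)[0] for x_prime_i in x_prime]
df = pandas.DataFrame({'x':x, 'y':y})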
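Since we chose the intercept and beta1 ourselves, one way to check the generator is to fit a logistic regression to its output and see whether the estimates land near the values we put in. The following is a minimal sketch of that idea, not necessarily the approach the rest of this chapter takes. The helper name generate_logistic_data, the fixed seed, and the vectorized form of the calculation are assumptions made for illustration.

```python
import numpy as np
import pandas
import statsmodels.formula.api as smf


def generate_logistic_data(observation_count, intercept, beta1, seed=None):
    """Hypothetical helper: the same recipe as above, vectorized and seedable."""
    rng = np.random.RandomState(seed)
    x = rng.uniform(0, 100, size=observation_count)
    # Inverse logit turns the linear predictor into a probability in (0, 1).
    probability = 1 / (1 + np.exp(-(intercept + beta1 * x)))
    # One 0/1 draw per observation, each with its own probability.
    y = rng.binomial(1, probability)
    return pandas.DataFrame({'x': x, 'y': y})


# The seed is an assumption so that repeated runs produce the same data.
df = generate_logistic_data(1000, intercept=-1.6, beta1=0.03, seed=42)

# Fit y against x and read back the estimated coefficients.
model = smf.logit('y ~ x', data=df).fit(disp=False)
estimates = model.params
print(estimates)
```

With 1,000 observations the estimates will not match -1.6 and 0.03 exactly, so any test built on this idea should compare them within a tolerance rather than for equality.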

We will sample the data from a binomial distribution, because its values stick between...