Test Driven Machine Learning


The problem with straight bootstrapping


With only a single observation of data, bootstrapping gives the same answer every time. Ironically, this means that when you bootstrap such a small dataset, you get zero variance. Here's an example in code:

plt.hist([np.random.choice([1]) for i in range(100)])

The histogram for sampling from a dataset that consists of only one element looks as follows:
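The same outcome can also be checked numerically. This is a minimal sketch that assumes NumPy's `default_rng` generator in place of the legacy `np.random.choice` call above; the seed and the variable names are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption, used only for repeatability
data = np.array([1])            # a dataset with a single observation

# Draw 100 bootstrap samples (with replacement) from the one-element dataset.
resamples = rng.choice(data, size=100, replace=True)

print(np.unique(resamples))  # the distinct values that were drawn
print(resamples.var())       # variance across the 100 draws
```

Because the only value available to draw is the single observation, the set of distinct values has one member and the variance across draws is zero.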

As predicted, every value is the same. This doesn't really match our intuition about uncertainty, though. We have only observed a single number, but it could just as easily have been a different one, and the technique as it stands doesn't capture that. So, how can we fix this? By throwing in a random number, of course! It's not terribly academic, but hopefully the tests will reveal that the performance will fare well. Here's the same scenario with the improved bootstrap:
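One way the "throw in a random number" idea could look in code is sketched below. It is not necessarily the implementation used in this book: the helper name `noisy_bootstrap`, the choice of Gaussian noise, and the `noise_scale` parameter are all assumptions made for illustration.

```python
import numpy as np

def noisy_bootstrap(data, n_draws, noise_scale=1.0, rng=None):
    """Resample `data` with replacement and add random noise to each draw.

    Hypothetical helper: the name, the Gaussian noise, and `noise_scale`
    are assumptions for illustration, not a confirmed implementation.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    draws = rng.choice(data, size=n_draws, replace=True)          # plain bootstrap step
    noise = rng.normal(loc=0.0, scale=noise_scale, size=n_draws)  # the added random number
    return draws + noise

rng = np.random.default_rng(0)  # seed is an assumption, used only for repeatability
plain = rng.choice(np.array([1.0]), size=100, replace=True)
noisy = noisy_bootstrap([1], n_draws=100, noise_scale=1.0, rng=rng)

print(plain.var())  # plain bootstrap of one observation: no spread
print(noisy.var())  # noisy bootstrap: spread around the observed value
```

How wide the resulting spread is depends entirely on `noise_scale`, so in practice that value would need to be justified by the tests rather than picked arbitrarily.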

As you can see in this visualization, rather than all of the distribution being focused in a...