To begin with, let's start building an extremely simplistic bandit algorithm. The actual bandit class will have two methods: choose_treatment
and log_payout
. The first method will recommend the best treatment to choose, and log_payout
will be used to report back on how effective the recommended treatment was.
The simplest way to approach this algorithm from a test-driven perspective is to start with a single treatment so that the algorithm only has one thing to recommend. The code for this test looks like the following:
from nose.tools import assert_equal import simple_bandit def given_a_single_treatment_test(): bandit = simple_bandit.SimpleBandit(['A']) chosen_treatment = bandit.choose_treatment() assert_equal(chosen_treatment, 'A', 'Should choose the only available option.')
Again, we'll start from such a simple place that making the test pass will seem too easy:
class SimpleBandit: def __init__(self, treatments): self._treatments = treatments def choose_treatment...