Multiclassifiers
In the preceding chapters, we saw that multi-label datasets, where a tweet may have zero, one, or more labels, are considerably harder to deal with than simple multi-class datasets, where each tweet has exactly one label drawn from a set of several options. In this chapter, we will investigate three ways of dealing with multi-label cases: using neutral as a label to handle tweets that are allowed to have zero labels; using varying thresholds to enable standard classifiers to return a variable number of labels; and training multiple classifiers, one per label, and allowing each to make a decision about the label it was trained for. The conclusion, as ever, will be that there is no single “silver bullet” that provides the best solution in every case, but in general, the use of multiple classifiers tends to be better than the other approaches.
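To make the three strategies concrete before we look at them in detail, here is a minimal sketch in plain Python. It is an illustration only, not the implementation developed later in the book: the function names, the emotion labels, the scores, and the keyword-based stand-in classifiers are all hypothetical and chosen purely for this example.

```python
def labels_above_threshold(scores, threshold):
    """Return every label whose score meets the threshold (possibly none)."""
    return sorted(label for label, score in scores.items() if score >= threshold)


def with_neutral(labels):
    """Map an empty label set to an explicit 'neutral' label."""
    return labels if labels else ["neutral"]


def one_classifier_per_label(classifiers, tweet):
    """Let each per-label binary classifier make its own yes/no decision."""
    return sorted(label for label, accepts in classifiers.items() if accepts(tweet))


# Hypothetical per-label scores from a single standard classifier.
scores = {"anger": 0.62, "joy": 0.05, "fear": 0.41}

# Varying the threshold changes how many labels are returned.
loose = labels_above_threshold(scores, 0.4)
strict = labels_above_threshold(scores, 0.7)

# Hypothetical stand-ins for trained binary classifiers, one per label.
classifiers = {
    "anger": lambda tweet: "furious" in tweet,
    "joy": lambda tweet: "happy" in tweet,
}
per_label = one_classifier_per_label(classifiers, "so happy and furious")

print(loose)                 # ['anger', 'fear']
print(with_neutral(strict))  # ['neutral']
print(per_label)             # ['anger', 'joy']
```

A low threshold returns several labels, a high one returns none (which the neutral label then makes explicit), and the per-label classifiers each decide independently of one another.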
In this chapter, we’ll cover the following topics...