Logistic regression presents an intimidating array of parameters to tweak for better performance, and working with it is a bit of a black art. Having built thousands of these classifiers, we are still learning how to do it better. This recipe will point you in the right general direction, but the topic probably deserves its own book.
This recipe involves extensive changes to the source of src/com/lingpipe/cookbook/chapter3/TuneLogRegParams.java. We will just run one configuration of it here, with most of the exposition in the How it works… section.
Engage your IDE or type the following on the command line:
java -cp lingpipe-cookbook.1.0.jar:lib/lingpipe-4.1.0.jar:lib/opencsv-2.4.jar com.lingpipe.cookbook.chapter3.TuneLogRegParams
The system then responds with the cross-validation output, a confusion matrix for our default data in data/disney_e_n.csv:
reference\response
          \e,n,
         e 11,0,
         n 6,4,
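To make the matrix easier to read, here is a minimal sketch that turns the four counts above into accuracy and per-category false positives, false negatives, precision, and recall. It is plain Java with no LingPipe dependency; the class name ConfusionMatrixSummary and its helper methods are our own illustration and are not part of TuneLogRegParams or the LingPipe API. Rows are reference (truth) categories and columns are system responses, matching the printed layout.

```java
public class ConfusionMatrixSummary {

    // Categories in the order used by the matrix printed above.
    static final String[] CATEGORIES = {"e", "n"};

    // COUNTS[reference][response], copied by hand from the recipe's output.
    static final int[][] COUNTS = {
        {11, 0},  // reference e: 11 labeled e, 0 labeled n
        {6, 4}    // reference n: 6 labeled e, 4 labeled n
    };

    // Total number of classified items.
    static int total(int[][] m) {
        int sum = 0;
        for (int[] row : m)
            for (int count : row)
                sum += count;
        return sum;
    }

    // Fraction of items on the diagonal (response equals reference).
    static double accuracy(int[][] m) {
        int correct = 0;
        for (int i = 0; i < m.length; ++i)
            correct += m[i][i];
        return correct / (double) total(m);
    }

    // Items labeled cat whose reference category is something else.
    static int falsePositives(int[][] m, int cat) {
        int fp = 0;
        for (int ref = 0; ref < m.length; ++ref)
            if (ref != cat)
                fp += m[ref][cat];
        return fp;
    }

    // Items whose reference category is cat but were labeled something else.
    static int falseNegatives(int[][] m, int cat) {
        int fn = 0;
        for (int resp = 0; resp < m[cat].length; ++resp)
            if (resp != cat)
                fn += m[cat][resp];
        return fn;
    }

    // Precision is undefined (NaN) if the category was never the response.
    static double precision(int[][] m, int cat) {
        int tp = m[cat][cat];
        int denom = tp + falsePositives(m, cat);
        return denom == 0 ? Double.NaN : tp / (double) denom;
    }

    // Recall is undefined (NaN) if the category never occurs as a reference.
    static double recall(int[][] m, int cat) {
        int tp = m[cat][cat];
        int denom = tp + falseNegatives(m, cat);
        return denom == 0 ? Double.NaN : tp / (double) denom;
    }

    public static void main(String[] args) {
        System.out.printf("accuracy=%.3f%n", accuracy(COUNTS));
        for (int i = 0; i < CATEGORIES.length; ++i) {
            System.out.printf("%s: FP=%d FN=%d precision=%.3f recall=%.3f%n",
                CATEGORIES[i], falsePositives(COUNTS, i), falseNegatives(COUNTS, i),
                precision(COUNTS, i), recall(COUNTS, i));
        }
    }
}
```

For this matrix, 15 of the 21 items sit on the diagonal, and all six errors are reference n items that the classifier labeled e. Those six are false positives for e and, equivalently, false negatives for n.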
Next, we will report on false positives...