For disease diagnosis, we are going to use the free Proben1 dataset collection, which is available on the web (http://www.filewatcher.com/m/proben1.tar.gz.1782734-0.html). Proben1 is a benchmark set of several datasets from different domains. We are going to use the cancer and the diabetes datasets. We added two new classes, one to run the experiments for each case: CancerDisease and DiabetesDisease.
The breast cancer dataset has ten variables: nine inputs and one binary output. It contains 699 records, of which we excluded 16 that were found to be incomplete; thus, we used 683 records to train and test a neural network.
Tip
In practical problems, it is common to have missing or invalid data. Ideally, the classification algorithm should handle these records, but sometimes it is better to exclude them, because they do not carry enough information to produce an accurate result.
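As an illustration of excluding incomplete records, here is a minimal sketch. It is not the code used for the experiments: the class IncompleteRecordFilter and its methods are hypothetical, and the sketch assumes that records are comma-separated text lines in which a missing value appears as "?" or as an empty field. The actual Proben1 files may use a different layout.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the book's code base.
class IncompleteRecordFilter {

    // Assumption: a missing value is written as "?" or left empty.
    static boolean isMissing(String field) {
        String f = field.trim();
        return f.isEmpty() || f.equals("?");
    }

    // Returns only the rows that contain exactly expectedColumns numeric fields.
    static List<double[]> keepComplete(List<String> rows, int expectedColumns) {
        List<double[]> complete = new ArrayList<>();
        for (String row : rows) {
            // The -1 limit keeps trailing empty fields instead of dropping them.
            String[] fields = row.split(",", -1);
            if (fields.length != expectedColumns) {
                continue; // wrong number of fields: treat as incomplete
            }
            double[] values = new double[expectedColumns];
            boolean ok = true;
            for (int i = 0; i < fields.length && ok; i++) {
                if (isMissing(fields[i])) {
                    ok = false;
                } else {
                    try {
                        values[i] = Double.parseDouble(fields[i].trim());
                    } catch (NumberFormatException e) {
                        ok = false; // non-numeric content: treat as invalid
                    }
                }
            }
            if (ok) {
                complete.add(values);
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        // Made-up sample rows with nine inputs and one binary output each.
        List<String> rows = Arrays.asList(
            "5,1,1,1,2,1,3,1,1,0",
            "8,4,5,1,2,?,7,3,1,1",
            "4,1,1,3,2,1,3,1,1,0");
        List<double[]> complete = keepComplete(rows, 10);
        System.out.println(complete.size() + " of " + rows.size() + " records kept");
    }
}
```

Filtering before training keeps the network code free of special cases for missing inputs, at the cost of discarding the affected records entirely.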
The following table shows the configuration...