An outlier is a data point whose value differs markedly from the rest of the data. Outliers can distort the training process, so they need to be handled carefully. In the following section, we illustrate with examples both how to detect outliers and the techniques used to handle them.
The outliers package can detect outlier values: its outlier() function returns the value that lies farthest from the sample mean. Setting the opposite=TRUE parameter returns the extreme value at the other end of the data instead. The detected values can be verified using a boxplot.
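The boxplot check mentioned above can be sketched with base R alone. The vector below is a hypothetical toy sample, not data from this chapter. `boxplot.stats()` uses the usual 1.5 × IQR whisker rule, which is a different criterion from "farthest from the mean", so the two checks will often, but not always, agree.

```r
# Hypothetical toy sample with one obviously extreme value (illustration only)
x <- c(10, 12, 11, 13, 12, 11, 95)

# Points lying beyond the boxplot whiskers (1.5 * IQR rule by default)
flagged <- boxplot.stats(x)$out
print(flagged)

# Draw the boxplot; flagged points appear as isolated dots past the whiskers
boxplot(x, horizontal = TRUE, main = "Toy sample")
```

If a value reported by the detection step also shows up as an isolated point in the plot, that is reasonable supporting evidence that it deserves a closer look.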
- Attach the outliers package:
- Detect outliers:
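The two steps above can be sketched as follows. The dataset is not named in this section; the column names in the output match `PimaIndiansDiabetes` from the `mlbench` package, so that dataset and the selection of its first four columns are assumptions made here for illustration.

```r
# Attach the outliers package (install.packages("outliers") if it is missing)
library(outliers)

# ASSUMPTION: the data is PimaIndiansDiabetes from mlbench,
# inferred from the column names shown in the output
data(PimaIndiansDiabetes, package = "mlbench")

# ASSUMPTION: only the first four numeric columns are inspected
features <- PimaIndiansDiabetes[, 1:4]

# For a data frame, outlier() returns, per column, the value
# that lies farthest from that column's mean
extremes <- outlier(features)
print(extremes)
```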
The output is as follows:
pregnant  glucose pressure  triceps
      17        0        0       99
- Detect outliers from the other end:
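A minimal sketch of this step, under the same assumptions as before (the `PimaIndiansDiabetes` data from `mlbench` and its first four columns are inferred, not stated in the text):

```r
library(outliers)

# ASSUMPTION: same dataset and column selection as in the detection step
data(PimaIndiansDiabetes, package = "mlbench")
features <- PimaIndiansDiabetes[, 1:4]

# opposite = TRUE returns, per column, the extreme value on the
# other side of the mean from the one reported by default
other_end <- outlier(features, opposite = TRUE)
print(other_end)
```

Passing `logical = TRUE` as well returns a logical mask marking the matching rows rather than the values themselves, which can be handy when the rows need to be inspected or filtered.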