Using one-sided functions to your advantage
Many people recognize the usefulness of one-sided functions in ML, such as
high_mean or low_mean, which detect anomalies only on the high side or only on the low side. This is useful when you care about just one direction, such as a drop in revenue or a spike in response time.
When you care about deviations in both directions, however, you might be inclined to use just the regular two-sided function (such as
mean). On some datasets, though, it is better to use the high and low versions of the function as two separate detectors. Why is this the case, and under what conditions, you might ask?
The condition where this makes sense is when the dynamic range of the possible deviations is asymmetrical. In other words, the magnitude of potential spikes in the data is far bigger than the magnitude of the potential drops, often because the count or sum of something cannot be less than zero. Let's look at the following screenshot...
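The effect of that asymmetry can be sketched outside Elastic ML with plain Python. The data, the median baseline, and the three-sigma-style thresholds below are illustrative assumptions (this is not how Elastic ML computes anomaly probabilities); the point is only that huge spikes inflate a single two-sided measure of spread until a drop to zero looks unremarkable, while a scale built from downward deviations alone catches it:

```python
import random
import statistics

# Hypothetical metric: hovers around 100, occasionally spikes into the
# thousands, but can never drop below zero.
random.seed(7)
history = [random.gauss(100, 2) for _ in range(97)] + [5000, 5200, 4800]

mu = statistics.mean(history)
overall_std = statistics.pstdev(history)  # inflated by the spikes

# Side-specific scales: average size of upward vs. downward residuals
# from the median baseline, modeling each direction separately.
baseline = statistics.median(history)
ups = [x - baseline for x in history if x > baseline]
downs = [baseline - x for x in history if x < baseline]
low_scale = statistics.mean(downs)   # small: drops have always been tiny

drop = 0.0  # the metric falls to zero -- clearly anomalous to a human

# One two-sided rule: the spike-inflated spread hides the drop.
symmetric_flags_drop = abs(drop - mu) / overall_std > 3

# A dedicated low-side rule, scaled only by past downward deviations,
# flags it easily.
low_side_flags_drop = (baseline - drop) / low_scale > 3

print(symmetric_flags_drop, low_side_flags_drop)  # prints: False True
```

Both approaches flag the spikes; only the dedicated low-side rule also flags the drop, which is exactly the situation where two one-sided detectors beat one two-sided one.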