## Detecting anomalies using density estimation

In general, normal elements are more common than anomalous entries in any system. So, if the probability of the occurrence of elements in a collection is modeled by the Gaussian or normal distribution, then we can conclude that the elements for which the estimated probability density is more than a predefined threshold are normal, and those for which the value is less than a predefined threshold are probably anomalies.

Let's say that is a random variable of rows. The following couple of formulae find the average and standard deviations for feature , or, in other words, for all the elements of in the *j*th column if is represented as a matrix.

Given a new entry *x,* the following formula calculates the probability density estimation:

If is less than a predefined threshold, then the entry is tagged to be anomalous, else it is tagged as normal.

The following code finds the average value of the jth feature:

Here is a sample run of the `px`

method:

**>...**