F# for Machine Learning Essentials

By: Sudipta Mukherjee

Overview of this book

The F# functional programming language enables developers to write simple code to solve complex problems. With F#, developers create consistent and predictable programs that are easier to test and reuse, simpler to parallelize, and less prone to bugs. If you want to learn how to use F# to build machine learning systems, then this is the book you want. Starting with an introduction to the main categories of machine learning, you will quickly learn to implement time-tested supervised learning algorithms. You will then move on to predicting housing prices using regression analysis, and learn to use Accord.NET to implement SVM techniques and clustering. You will also learn to build a recommender system for your e-commerce site from scratch. Finally, you will dive into advanced topics such as implementing neural network algorithms and performing sentiment analysis on your data.

Detecting point anomalies using Grubbs' test


Grubbs' test (also known as the maximum normed residual test) detects anomalies in a univariate dataset (one with a single variable per data instance) under the assumption that the data is generated by a Gaussian distribution. For each test instance $x$, its score $z$ is computed as follows:

$$z = \frac{|x - \bar{x}|}{s}$$

Here, $\bar{x}$ is the average of the data instances and $s$ is their standard deviation.

The following functions determine the score of each element in the list:
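
A minimal sketch of such scoring functions is shown below. It assumes the sample standard deviation (dividing by N - 1), and the names `stdDev` and `zScores` are hypothetical rather than taken from the book's listing.

```fsharp
// Sample standard deviation (divides by N - 1); this choice is an
// assumption of the sketch.
let stdDev (xs: float list) =
    let mean = List.average xs
    let sumSq = xs |> List.sumBy (fun x -> (x - mean) ** 2.0)
    sqrt (sumSq / float (List.length xs - 1))

// Score of every element: |x - mean| / s, in the same order as the input.
let zScores (xs: float list) =
    let mean = List.average xs
    let s = stdDev xs
    xs |> List.map (fun x -> abs (x - mean) / s)
```

For example, `zScores [1.0; 2.0; 3.0]` evaluates to `[1.0; 0.0; 1.0]`, because the mean is 2 and the sample standard deviation is 1.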

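A data instance is declared to be anomalous if its score $z$ fulfills the following condition:

$$z > \frac{N - 1}{\sqrt{N}} \sqrt{\frac{t^2_{\alpha/(2N),\,N-2}}{N - 2 + t^2_{\alpha/(2N),\,N-2}}}$$

Here, $N$ is the number of elements in the collection and $t_{\alpha/(2N),\,N-2}$ is the threshold used to declare an instance to be anomalous or normal: the value taken by a t-distribution with $N - 2$ degrees of freedom at a significance level of $\alpha/(2N)$.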

The following function finds the elements whose score indicates that they might be anomalous. The xs parameter denotes the entire collection and t denotes the t-distribution critical value $t_{\alpha/(2N),\,N-2}$.
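
One way to sketch such a function is to compute the threshold from the collection size and t, then keep the elements whose score exceeds it. The names `grubbsThreshold` and `findAnomalies` are hypothetical, the sample standard deviation is assumed, and t must be supplied by the caller because the F# core library has no t-distribution function.

```fsharp
// Right-hand side of the anomaly condition for a collection of size n,
// given the critical value t supplied by the caller.
let grubbsThreshold (n: int) (t: float) =
    let nf = float n
    (nf - 1.0) / sqrt nf * sqrt (t * t / (nf - 2.0 + t * t))

// Keeps the elements of xs whose score |x - mean| / s exceeds the threshold.
// Uses the sample standard deviation (an assumption of this sketch).
let findAnomalies (xs: float list) (t: float) =
    let mean = List.average xs
    let sumSq = xs |> List.sumBy (fun x -> (x - mean) ** 2.0)
    let s = sqrt (sumSq / float (List.length xs - 1))
    let limit = grubbsThreshold (List.length xs) t
    xs |> List.filter (fun x -> abs (x - mean) / s > limit)
```

Passing t as a parameter keeps the sketch free of external dependencies; the critical value can come from a t-table or a statistics library.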

The following code shows you how to use these functions to find anomalous data instances...
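
As a hedged illustration of such usage, the sketch below runs the test on a small list. The readings are made up, `findAnomalies` is a hypothetical helper restated here so that the snippet runs on its own, and `t = 4.1` is only a placeholder for the critical value, which in real work would come from a t-table or a statistics library.

```fsharp
// Hypothetical helper combining scoring and thresholding; it is not the
// book's listing. Uses the sample standard deviation (an assumption).
let findAnomalies (xs: float list) (t: float) =
    let n = float (List.length xs)
    let mean = List.average xs
    let s = sqrt ((xs |> List.sumBy (fun x -> (x - mean) ** 2.0)) / (n - 1.0))
    let limit = (n - 1.0) / sqrt n * sqrt (t * t / (n - 2.0 + t * t))
    xs |> List.filter (fun x -> abs (x - mean) / s > limit)

// Made-up readings with one obviously extreme value.
let readings = [10.0; 10.1; 9.9; 10.2; 9.8; 10.0; 10.1; 25.0]

// Placeholder critical value, chosen for illustration only.
let t = 4.1

let anomalies = findAnomalies readings t
printfn "Anomalous values: %A" anomalies
```

With these inputs the threshold works out to roughly 2.12, and only 25.0 scores above it.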