Book Image

F# for Machine Learning Essentials

By : Sudipta Mukherjee
Book Image

F# for Machine Learning Essentials

By: Sudipta Mukherjee

Overview of this book

The F# functional programming language enables developers to write simple code to solve complex problems. With F#, developers create consistent and predictable programs that are easier to test and reuse, simpler to parallelize, and are less prone to bugs. If you want to learn how to use F# to build machine learning systems, then this is the book you want. Starting with an introduction to the several categories on machine learning, you will quickly learn to implement time-tested, supervised learning algorithms. You will gradually move on to solving problems on predicting housing pricing using Regression Analysis. You will then learn to use Accord.NET to implement SVM techniques and clustering. You will also learn to build a recommender system for your e-commerce site from scratch. Finally, you will dive into advanced topics such as implementing neural network algorithms while performing sentiment analysis on your data.
Table of Contents (16 chapters)
F# for Machine Learning Essentials
Credits
Foreword
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

Chi-squared statistic to determine anomalies


Ye and Chen used a statistic to determine anomalies in the operating system call data. The training phase assumes that the normal data has a multivariate normal distribution. The value of the statistic is determined as:

Where denotes the observed value of the ith variable, is the expected value of the ith variable (obtained from the training data), and n is the number of variables. A large value of denotes that the observed sample contains anomalies.

The following function calculates the respective values for all the elements in a collection:

When this function is called with the same data [1.;100.;2.;4.5;2.55;70.] as the observed data and [111.;100.;2.;4.5;2.55;710.] as the expected values then the following result is obtained:

[(1.0, 12100.0); (100.0, 0.0); (2.0, 0.0); (4.5, 0.0); (2.55, 0.0);
   (70.0, 5851.428571)]

As you can see, the value of is very high (121000.0 and 5851.428571) in the first and last observations. This means that the...