Go Machine Learning Projects

By: Xuanyi Chew

Overview of this book

Go is the perfect language for machine learning; it helps to clearly describe complex algorithms, and also helps developers to understand how to run efficient, optimized code. This book will teach you how to implement machine learning in Go, producing programs that are easy to deploy and code that is easy to understand, debug, and benchmark. The book begins by guiding you through setting up your machine learning environment with Go libraries and capabilities. You will then plunge into regression analysis of a real-life house pricing dataset and build a classification model in Go to classify emails as spam or ham. Using Gonum, Gorgonia, and STL, you will explore time series analysis along with decomposition, and clean up your personal Twitter timeline by clustering tweets. In addition, you will learn how to recognize handwriting using neural networks and convolutional neural networks. Lastly, you'll learn how to choose the most appropriate machine learning algorithms for your projects with the help of a facial detection project. By the end of this book, you will have developed a solid machine learning mindset, a firm grasp of the powerful Go toolkit, and a sound understanding of the practical implementations of machine learning algorithms in real-world projects.
Table of Contents (12 chapters)

Data massage

When we checked that the data structure made sense, we printed the FullText field. We want to cluster tweets by their content, and that content is found in the FullText field of the struct. Later in the chapter, we will see how the metadata of the tweets, such as location, can help cluster them better.

As mentioned in the previous sections, each individual tweet needs to be represented as a coordinate in some higher-dimensional space. Thus, our goal is to take all the tweets in a timeline and preprocess them in such a way that we can get this output table:

| Tweet ID | twitter | test | right | wrong |
|:--------:|:------:|:----:|:----:|:---:|
| 1 | 0 | 1 | 0 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 0 | 0 | 1 | 1 |

Each row in the table represents a tweet, indexed by the tweet ID. The columns that follow are words that...
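
To make the target table concrete, here is a minimal sketch of how such a tweet-by-word matrix could be built using only the standard library. It is an illustration under assumptions, not the chapter's own code: the `tweet` struct is a hypothetical, pared-down stand-in that keeps only an ID and the `FullText` field, the helper names `buildVocab` and `vectorize` are invented for this example, and splitting on whitespace after lower-casing is a deliberately naive form of tokenization.

```go
package main

import (
	"fmt"
	"strings"
)

// tweet is a hypothetical stand-in for the chapter's tweet struct;
// only the FullText field mentioned in the text is assumed to exist.
type tweet struct {
	ID       int
	FullText string
}

// tokenize is a naive tokenizer (an assumption for this sketch):
// lower-case the text and split it on whitespace.
func tokenize(s string) []string {
	return strings.Fields(strings.ToLower(s))
}

// buildVocab assigns every distinct word a column index,
// in order of first appearance across all tweets.
func buildVocab(tweets []tweet) (map[string]int, []string) {
	index := make(map[string]int)
	var words []string
	for _, t := range tweets {
		for _, w := range tokenize(t.FullText) {
			if _, seen := index[w]; !seen {
				index[w] = len(words)
				words = append(words, w)
			}
		}
	}
	return index, words
}

// vectorize returns one row per tweet. A cell holds 1 if the tweet
// contains the word for that column, and 0 otherwise.
func vectorize(tweets []tweet, index map[string]int) [][]int {
	rows := make([][]int, len(tweets))
	for i, t := range tweets {
		row := make([]int, len(index))
		for _, w := range tokenize(t.FullText) {
			row[index[w]] = 1
		}
		rows[i] = row
	}
	return rows
}

func main() {
	// Made-up tweets chosen so that the result mirrors the table above.
	tweets := []tweet{
		{ID: 1, FullText: "test"},
		{ID: 2, FullText: "twitter"},
		{ID: 3, FullText: "right wrong"},
	}
	index, words := buildVocab(tweets)
	rows := vectorize(tweets, index)

	fmt.Println("Tweet ID", words)
	for i, row := range rows {
		fmt.Println(tweets[i].ID, row)
	}
}
```

Each row of the result is the coordinate of one tweet in a space with one dimension per distinct word. The column order here follows first appearance, so it may differ from the order shown in the table; for clustering, only the consistency of the ordering across rows matters.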