Machine Learning with BigQuery ML

By: Alessandro Marrandino

Overview of this book

BigQuery ML enables you to build machine learning (ML) models with SQL and very little additional coding. This book will help you to accelerate the development and deployment of ML models with BigQuery ML. The book starts with a quick overview of Google Cloud and BigQuery architecture. You'll then learn how to configure a Google Cloud project, understand the architectural components and capabilities of BigQuery, and find out how to build ML models with BigQuery ML. The book teaches you how to apply ML using SQL on BigQuery. You'll analyze the key phases of an ML model's lifecycle and get to grips with the SQL statements used to train, evaluate, test, and use a model. As you advance, you'll work through a series of practical use cases that apply ML techniques such as linear regression, binary and multiclass logistic regression, k-means, ARIMA time series, deep neural networks, and XGBoost. Moving on, you'll cover matrix factorization and deep neural networks using BigQuery ML's capabilities. Finally, you'll explore the integration of BigQuery ML with other Google Cloud Platform components such as AI Platform Notebooks and TensorFlow, and discover best practices, tips, and tricks for hyperparameter tuning and performance enhancement. By the end of this BigQuery book, you'll be able to build and evaluate your own ML models with BigQuery ML.

Discovering K-Means clustering

In this section, we'll look at what unsupervised learning is and learn the basics of the K-Means clustering technique.

K-Means is an unsupervised learning algorithm that solves clustering problems. This technique groups data into a set of clusters. The letter k represents the number of clusters, which is fixed a priori. For our business scenario, we'll use three different clusters.

Important note

While supervised learning relies on prior knowledge of the output values, or labels, in a training dataset, unsupervised learning does not use labeled datasets. Its goal is to infer the structure of the data within a training dataset, without any prior knowledge of it.

Each cluster of data is characterized by a centroid. The centroid represents the midpoint of the cluster, that is, the mean of the cluster's data points across the features of the model. It is identified during the training stage.
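To make the roles of k and the centroids concrete, here is a minimal sketch in plain Python. It is not how BigQuery ML implements K-Means; it is the textbook iterative procedure (often called Lloyd's algorithm), and the function names, the toy data, and the choice of the first k points as starting centroids are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def update(points, labels, centroids):
    """Recompute each centroid as the per-feature mean of its assigned points."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:  # an empty cluster keeps its previous centroid
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(axis) / len(members) for axis in zip(*members)))
    return new_centroids


def k_means(points, k, max_iter=100):
    """Illustrative K-Means; starts from the first k points (an assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = assign(points, centroids)
    for _ in range(max_iter):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments are stable, so stop
            break
        labels = new_labels
    return centroids, labels


# Hypothetical toy data: six rows with two features, in three visible groups.
points = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
          (1.0, 2.0), (5.0, 6.0), (9.0, 2.0)]
centroids, labels = k_means(points, k=3)
print(centroids)  # one centroid per cluster
print(labels)     # cluster index assigned to each row
```

With k=3, each row ends up with a cluster index between 0 and 2, and each centroid is the average of the rows assigned to it. Note that no labels are supplied at any point, which is what makes the technique unsupervised.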

After the training of the K-Means...