Machine Learning with BigQuery ML

By: Alessandro Marrandino

Overview of this book

BigQuery ML lets you build machine learning (ML) models with SQL and very little additional code. This book will help you accelerate the development and deployment of ML models with BigQuery ML. It starts with a quick overview of Google Cloud and the BigQuery architecture. You'll then learn how to configure a Google Cloud project, understand the architectural components and capabilities of BigQuery, and find out how to build ML models with BigQuery ML using SQL. You'll analyze the key phases of an ML model's lifecycle and get to grips with the SQL statements used to train, evaluate, test, and use a model. As you advance, you'll work through a series of practical use cases that apply different ML techniques, such as linear regression, binary and multiclass logistic regression, k-means, ARIMA time series, deep neural networks, and XGBoost. Moving on, you'll cover matrix factorization and deep neural networks using BigQuery ML's capabilities. Finally, you'll explore the integration of BigQuery ML with other Google Cloud Platform components, such as AI Platform Notebooks and TensorFlow, and discover best practices, tips, and tricks for hyperparameter tuning and performance enhancement. By the end of this BigQuery book, you'll be able to build and evaluate your own ML models with BigQuery ML.

Exploring and understanding the dataset

Before diving into the machine learning implementation, it's necessary to analyze the data that is available for our use case. Since machine learning training is based on examples, we need to clearly understand what data to consider and check the quality of the available records.


Data scientists and business analysts spend a lot of time and resources getting a clear understanding of their datasets, checking their quality, and preparing them. Although these operations don't seem to be directly linked to the implementation of a machine learning algorithm, they are essential if you wish to get solid results. The actual training of the model is the last mile of a longer journey that begins with understanding the data, checking its quality, and preparing it.
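To make the idea of a quality check concrete, here is a minimal sketch in plain Python. The records, the field names (`trip_id`, `duration_s`, `start_station`), and the `profile` helper are all hypothetical and are not taken from the book's dataset. In BigQuery, the same checks would typically be written as SQL aggregate queries; the sketch only illustrates what to look for: missing values, duplicated keys, and implausible numbers.

```python
from collections import Counter

# Hypothetical records: field names and values are illustrative only
# and do not come from the book's dataset.
records = [
    {"trip_id": 1, "duration_s": 540, "start_station": "A"},
    {"trip_id": 2, "duration_s": None, "start_station": "B"},
    {"trip_id": 3, "duration_s": -20, "start_station": None},
    {"trip_id": 3, "duration_s": 300, "start_station": "A"},
]


def profile(rows, key, numeric_field):
    """Return simple quality counts for a list of dict records."""
    fields = sorted({name for row in rows for name in row})
    # Count missing values per field (None or absent key).
    missing = {f: sum(1 for row in rows if row.get(f) is None) for f in fields}
    # Find key values that occur more than once (None keys are ignored here).
    key_counts = Counter(row[key] for row in rows if row.get(key) is not None)
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    # Count numeric values that are present but negative.
    negatives = sum(
        1
        for row in rows
        if row.get(numeric_field) is not None and row[numeric_field] < 0
    )
    return {
        "rows": len(rows),
        "missing": missing,
        "duplicate_keys": duplicates,
        "negative_values": negatives,
    }


report = profile(records, key="trip_id", numeric_field="duration_s")
print(report)
```

Counts like these help decide which records to exclude or repair before any model is trained.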

Let's start by getting a clear understanding of the information that we have in our dataset to build our use case.

Understanding the data

To have a clear understanding...