Journey to Become a Google Cloud Machine Learning Engineer

By: Dr. Logan Song
Overview of this book

This book is a study guide for learning and mastering machine learning in Google Cloud: building a broad and strong knowledge base, training hands-on skills, and getting certified as a Google Cloud Machine Learning Engineer. It is for readers who have basic Google Cloud Platform (GCP) knowledge and skills and basic Python programming skills, and who want to learn machine learning in GCP as their next step toward becoming a Google Cloud Certified Machine Learning professional. The book starts by laying the foundations of Google Cloud Platform and Python programming, followed by the building blocks of machine learning, then focuses on machine learning in Google Cloud, and finally concludes the preparation for the Google Cloud Machine Learning certification by integrating all the knowledge and skills together. The book is based on the graduate courses the author has been teaching at the University of Texas at Dallas. When going through the chapters, you are expected to study the concepts, complete the exercises, understand and practice the labs in the appendices, and study each exam question thoroughly. Then, at the end of the learning journey, you can expect to harvest the knowledge, skills, and a certificate.
Table of Contents (23 chapters)
Part 1: Starting with GCP and Python
Part 2: Introducing Machine Learning
Part 3: Mastering ML in GCP
Part 4: Accomplishing GCP ML Certification
Part 5: Appendices
Appendix 2: Practicing Using the Python Data Libraries

Data preparation

In the previous chapters, we discussed Python libraries such as NumPy, pandas, Matplotlib, and Seaborn for processing and visualizing data. Let’s start by importing the libraries we need here:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

We will use a simple dataset that has only 4 columns and 10 rows.

Notice that some of the columns are categorical and others are numerical, and some of them have missing values that we need to fix. The dataset .csv file is uploaded to Google Colab.
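One way to confirm which columns are categorical, which are numerical, and where values are missing is to inspect a DataFrame directly. The following is a minimal sketch that uses a small made-up frame, not the contents of Data.csv; the column name Target and all of the values are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Made-up stand-in data (NOT the contents of Data.csv).
# "Target" is a hypothetical name for the fourth column.
demo = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, np.nan, 38.0],
    "Salary": [72000.0, 48000.0, 54000.0, np.nan],
    "Target": ["No", "Yes", "No", "Yes"],
})

# Columns with a numeric dtype are treated as numerical;
# everything else is treated as categorical in this sketch.
numerical_cols = demo.select_dtypes(include="number").columns.tolist()
categorical_cols = demo.select_dtypes(exclude="number").columns.tolist()

# Count the missing values (NaN) in each column.
missing_per_column = demo.isna().sum()

print("numerical:", numerical_cols)
print("categorical:", categorical_cols)
print(missing_per_column)
```

Applied to the real dataset, the same calls would show which columns need encoding and which need their missing values filled in.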

Using the pandas read_csv function, we read the data into a variable named dataset. The first three columns (Country, Age, and Salary) are the features, which we assign to X; the last column is the value to predict, which we assign to y:

dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:,:-1].values
y = dataset.iloc[:, -1].values
print(X)
[['France...
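To see what the two iloc slices select, the split can be tried on a tiny in-memory CSV. This is a sketch under stated assumptions: the CSV text is made up, Target is a hypothetical name for the last column, and io.StringIO stands in for the uploaded file.

```python
import io
import pandas as pd

# Made-up CSV text standing in for Data.csv (values are illustrative only).
# "Target" is a hypothetical name for the last column.
csv_text = """Country,Age,Salary,Target
France,44,72000,No
Spain,27,,Yes
Germany,,54000,No
"""

demo = pd.read_csv(io.StringIO(csv_text))

# iloc[:, :-1] selects every row and all columns except the last (the features).
X_demo = demo.iloc[:, :-1].values
# iloc[:, -1] selects every row of the last column only (the value to predict).
y_demo = demo.iloc[:, -1].values

print(X_demo.shape)  # (rows, feature columns)
print(y_demo.shape)  # (rows,)
```

Empty fields in the CSV are read as NaN, which is how the missing values mentioned earlier appear in X.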