TensorFlow Developer Certificate Guide

By: Oluwole Fagbohun

Overview of this book

The TensorFlow Developer Certificate Guide is a resource for machine learning enthusiasts and data professionals seeking to master TensorFlow and validate their skills by earning the certification. This practical guide equips you with the skills and knowledge necessary to build robust deep learning models that tackle real-world challenges across diverse industries. You’ll work through easy-to-follow, step-by-step explanations and practical examples, building sophisticated models using TensorFlow 2.x, overcoming common hurdles such as overfitting, and applying techniques such as data augmentation. With this book, you’ll discover a wide range of practical applications, including computer vision, natural language processing (NLP), and time series prediction. To prepare you for the TensorFlow Developer Certificate exam, it offers comprehensive coverage of exam topics, including image classification, NLP, and time series analysis. With the TensorFlow certification, you’ll be primed to tackle a broad spectrum of business problems and advance your career in machine learning. Whether you are a novice or an experienced developer, this guide will help you become a highly skilled TensorFlow professional.
Table of Contents (20 chapters)
Part 1 – Introduction to TensorFlow
Part 2 – Image Classification with TensorFlow
Part 3 – Natural Language Processing with TensorFlow
Part 4 – Time Series with TensorFlow

ML life cycle

Before embarking on any ML project, we must take into account the key components that determine whether the project will succeed. As data professionals who want to build and implement successful ML projects, we need to understand how the ML life cycle works. The ML life cycle is a sensible framework for implementing an ML project, as shown in Figure 1.7:

Figure 1.7 – The ML life cycle

Let’s look at each of these in detail.

The business case

Before unleashing state-of-the-art models on any problem, it is imperative that you take the time to sit with stakeholders and clearly understand the business objectives or the pain points to be resolved; without that clarity, the entire process will almost certainly fail. Always keep in mind that the goal of the process is not to test a new breakthrough model you have been itching to try out, but to solve a pain point or create value for your company.

Once we understand the problem, we can categorize it as either a supervised or an unsupervised learning task. This phase of the ML life cycle is all about asking the right questions. We need to sit with the team concerned to determine which key metrics will define the project as a success. What resources are required in terms of budget, manpower, compute, and project timeline? Do we have the domain understanding, or do we need an expert’s input to define and understand the underlying factors and goals that will determine the project’s success? These are some of the questions we should ask as data professionals before we embark on a project.

For the exam, we will need to understand the requirements of each question before we tackle it. We will discuss the exam in much more detail before we conclude this chapter.

Data gathering and understanding

When all the requirements are detailed, the next step is to collect the data required for the project. In this phase, we first determine what type of data we will collect and where we will collect it from. Before we collect anything, we need to ask ourselves whether the data is relevant – for example, if we collect historical car data from 1980, would we be able to predict the price of a car in 2022? Would the data be made available by stakeholders, or would we collect it from a database, Internet of Things (IoT) devices, or via web scraping? Would there be any need to collect secondary data for the task at hand? We also need to establish whether the data will be collected all at once or whether collection will be a continuous process. Once we have collected the data needed for the project, we can start building an understanding of it.

Next, we would examine the data to see whether it is in the right format. For example, if you collect car sales data from multiple sources, one source may record a car’s mileage in kilometers and another may use miles. There could also be missing values in some of the features, and we might encounter duplicates, outliers, and irrelevant features in the data we collected. During this phase, we would carry out data exploration to gain insights into the data, and data preprocessing to handle issues such as formatting problems, missing values, duplicates, irrelevant features, outliers, imbalanced data, and categorical features.
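To make the preprocessing step more concrete, here is a minimal sketch in plain Python of three of the issues mentioned above: duplicates, inconsistent units, and missing values. The records, the field names (id, mileage, unit, price), and the choice of median imputation are all assumptions made for illustration; a real project would more likely use a library such as pandas and choose an imputation strategy suited to the data.

```python
from statistics import median

KM_PER_MILE = 1.609344

# Hypothetical records merged from two sources; field names are illustrative.
records = [
    {"id": 1, "mileage": 50000.0, "unit": "km", "price": 9000.0},
    {"id": 2, "mileage": 40000.0, "unit": "mi", "price": None},
    {"id": 2, "mileage": 40000.0, "unit": "mi", "price": None},  # duplicate row
    {"id": 3, "mileage": 120000.0, "unit": "km", "price": 4000.0},
]


def preprocess(rows):
    # 1. Drop exact duplicates, keeping the first occurrence.
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)

    # 2. Convert every mileage to kilometers so the feature has one unit.
    cleaned = []
    for row in unique:
        km = row["mileage"] * KM_PER_MILE if row["unit"] == "mi" else row["mileage"]
        cleaned.append({"id": row["id"], "mileage_km": km, "price": row["price"]})

    # 3. Fill missing prices with the median of the known prices
    #    (an assumed strategy, not the only reasonable one).
    fill_value = median(r["price"] for r in cleaned if r["price"] is not None)
    for r in cleaned:
        if r["price"] is None:
            r["price"] = fill_value
    return cleaned


clean_records = preprocess(records)
```

The order matters in this sketch: duplicates are removed before imputation so that a repeated row does not influence the median.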

Modeling

We now have a good understanding of the business needs, we have decided on the type of ML problem that we will address, and we have good-quality data after completing our preprocessing step. We will split our data into a training set and hold out a small subset as a test set to evaluate the model’s performance. We will train our model on the training set to learn the relationship between the features and the target variable. For example, we could train our fraud detection model on historical data provided by the bank and evaluate it on our holdout set (the test set) before deploying it for use. We go through an iterative process of fine-tuning our model’s hyperparameters until we arrive at our optimal model.
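The idea of holding out a test set can be sketched in a few lines of plain Python. The helper name holdout_split, the 20 percent test fraction, and the fixed seed are assumptions for illustration only; in practice, a library routine such as scikit-learn's train_test_split is the more common choice.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and hold out a fraction of them as the test set."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for 100 labeled examples (for instance, bank transactions).
examples = list(range(100))
train_set, test_set = holdout_split(examples)
```

The model is fitted only on train_set; test_set is touched once, at evaluation time, so that the reported performance reflects unseen data.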

Whether the modeling process is a success depends on the business objective: an accuracy of 90 percent still leaves a 10 percent error rate, which could be decisive in high-stakes domains such as healthcare. Imagine you deploy a model for early-stage cancer detection with an accuracy of 90 percent. On average, the model would be wrong for about 1 in every 10 people, or about 10 times in 100 cases, and some of those errors could be people with cancer misclassified as healthy. Such a person might not seek medical advice, which could lead to an untimely demise. Your company could get sued and the blame would fall in your lap. To avoid situations like this, we need to understand which metrics matter most for our project and which we can be less strict about. It is also important to address factors such as class imbalance, model interpretability, and ethical implications.
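A small worked example shows why accuracy alone can mislead on imbalanced data. The numbers below are invented for illustration: 100 patients, 10 of whom have cancer, and a trivial model that predicts "healthy" for everyone. The function name accuracy_and_recall is hypothetical.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall for the positive class)."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_positives = sum(t == positive and p == positive for t, p in pairs)
    actual_positives = sum(t == positive for t in y_true)
    accuracy = correct / len(y_true)
    # Recall is the share of actual positives that the model caught.
    recall = true_positives / actual_positives if actual_positives else 0.0
    return accuracy, recall


# Invented data: 10 patients with cancer (1) and 90 healthy patients (0).
y_true = [1] * 10 + [0] * 90
# A trivial model that labels every patient as healthy.
y_pred = [0] * 100

accuracy, recall = accuracy_and_recall(y_true, y_pred)
```

Here the accuracy is 0.9 while the recall is 0.0: the model looks 90 percent accurate yet misses every patient who has cancer, which is exactly the kind of failure the business objective needs to rule out.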

There are various metrics that are used to evaluate a model, and the choice of metric depends on the type of problem we are handling. We will discuss regression metrics in Chapter 3, Linear Regression with TensorFlow, and classification metrics in Chapter 4, Classification with TensorFlow.

Error analysis

We are not ready for deployment yet. Remember the 10 percent of errors that could tank our project? We will address them here. We perform an error analysis to identify the misclassified examples and work out why the model missed them. Do we have enough representative samples of these misclassified cases in our training data? We would have to determine whether we need to collect more data to capture the cases where the model failed. Can we generate synthetic data to cover them? Or were the misclassifications down to wrong labeling?
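One simple way to start an error analysis is to count misclassifications per true label and compare them with how often each label appears. The sketch below uses invented labels and predictions, and the helper name error_rate_by_label is hypothetical; it only surfaces where the errors concentrate, not why they occur.

```python
from collections import Counter


def error_rate_by_label(y_true, y_pred):
    """Return the fraction of examples misclassified, for each true label."""
    totals = Counter(y_true)
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: errors[label] / totals[label] for label in totals}


# Invented evaluation results for illustration.
y_true = ["cat", "cat", "dog", "dog", "bird", "bird"]
y_pred = ["cat", "cat", "dog", "cat", "cat", "dog"]

rates = error_rate_by_label(y_true, y_pred)
```

In this toy data every "bird" example is misclassified, which would prompt the questions raised above: are there enough bird examples in the training data, and are their labels correct?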

Wrongly labeled data can hamper the performance of a model, because the model learns incorrect relationships between the features and the target. The result is poor performance on unseen data, an unreliable model, and a waste of resources and time. Once we resolve these questions and ensure accurate labels, we need to retrain and reevaluate our model. We repeat these steps until the business objective is achieved, and then we can proceed to deploy our model.

Model deployment and monitoring

After resolving the issues identified in the error analysis step, we can now deploy our model to production. There are various methods of deployment available: we could deploy our model as a web service, on the cloud, or on edge devices. Model deployment can be challenging as well as exciting, because the entire point of building and training a model is to allow end users to apply it to solve a pain point. Once we deploy our model, we also monitor it to ensure that the overall objectives of the business continue to be met. Even the best-performing models can begin to underperform over time due to concept drift and data drift. Hence, after deploying our model, we cannot retire to some island. We need to continuously monitor our model and retrain it when needed to ensure that it continues to perform optimally.
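As a rough illustration of monitoring for data drift, the sketch below compares the mean of a feature in recent production data with its mean in the training data, measured in training standard deviations. The function names, the sample values, and the threshold of 2.0 are all assumptions chosen for illustration. This is a crude check on a single feature; it does not detect concept drift, where the relationship between features and target changes.

```python
from statistics import mean, stdev


def mean_shift_score(baseline, live):
    """Shift of the live mean from the baseline mean, in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; the score is undefined")
    return abs(mean(live) - mean(baseline)) / spread


def needs_review(baseline, live, threshold=2.0):
    # The threshold is an arbitrary assumption; tune it for the use case.
    return mean_shift_score(baseline, live) > threshold


# Invented values of one feature at training time and in production.
training_values = [10.0, 11.0, 9.0, 10.5, 9.5]
stable_batch = [10.2, 9.8, 10.1]
shifted_batch = [14.0, 15.0, 13.5]

stable_flag = needs_review(training_values, stable_batch)
shifted_flag = needs_review(training_values, shifted_batch)
```

A batch that trips the check would be a signal to investigate and, if the shift is real, to retrain the model on more recent data.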

We have now gone through the full length of the ML life cycle at a high level. Of course, there is a lot more that we could discuss in greater depth, but that is beyond the scope of the exam. Hence, we will now switch our focus to a number of exciting use cases where ML can be applied.