Machine Learning Using TensorFlow Cookbook

By: Luca Massaron, Alexia Audevart, Konrad Banachewicz

Overview of this book

The independent recipes in Machine Learning Using TensorFlow Cookbook will teach you how to perform complex data computations and gain valuable insights into your data. Dive into recipes on training models, model evaluation, sentiment analysis, regression analysis, artificial neural networks, and deep learning - each using Google’s machine learning library, TensorFlow. This cookbook covers the fundamentals of the TensorFlow library, including variables, matrices, and various data sources. You’ll discover real-world implementations of Keras and TensorFlow and learn how to use estimators to train linear models and boosted trees, both for classification and regression. Explore the practical applications of a variety of deep learning architectures, such as recurrent neural networks and Transformers, and see how they can be used to solve computer vision and natural language processing (NLP) problems. With the help of this book, you will be proficient in using TensorFlow, understand deep learning from the basics, and be able to implement machine learning algorithms in real-world scenarios.
Table of Contents (15 chapters)

5. Boosted Trees
11. Reinforcement Learning with TensorFlow and TF-Agents
13. Other Books You May Enjoy
14. Index

Processing categorical data

In tabular data, categorical features are usually stored as strings. Each unique value in a categorical feature represents a quality of the example under examination (hence we call this information qualitative, whereas numerical information is quantitative). In statistical terms, each unique value is called a level and the categorical feature is called a factor. Sometimes numeric codes are used as categories (identifiers, for example) because the qualitative information was previously encoded into numbers. The way to deal with them doesn't change: the information is stored as numeric values, but it should be treated as categorical.
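To make the terminology concrete, here is a minimal sketch in plain Python. The `store_codes` column and its values are hypothetical and serve only as illustration. It shows numeric identifiers being treated as a factor: the codes are cast to strings so that no ordering or arithmetic is implied, and the levels are simply the unique values.

```python
from collections import Counter

# Hypothetical toy column: numeric codes that are really identifiers,
# not quantities (averaging them would be meaningless).
store_codes = [101, 205, 101, 330, 205, 101]

# Cast to strings so the values are handled as labels, not numbers.
store_factor = [str(code) for code in store_codes]

# The levels of the factor are its unique values.
levels = sorted(set(store_factor))

# Counting occurrences per level is a meaningful summary for a factor.
level_counts = Counter(store_factor)

print(levels)
print(level_counts)
```

Sorting the levels here is only for a stable display; it does not imply that the levels have a meaningful order.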

You don't know how each unique value in a categorical feature relates to the other values in that feature; if you jump ahead and group values together or order them, you are expressing a hypothesis about the data. You can therefore treat each value as a value in itself. Hence...
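One common way to treat each level as a value in itself is one-hot encoding, where every level gets its own binary column. The sketch below assumes that technique; the passage above does not confirm it as the approach taken here. The helper names `fit_vocabulary` and `one_hot` and the `colors` column are hypothetical, and the code uses plain Python rather than any TensorFlow API.

```python
def fit_vocabulary(values):
    """Map each distinct level to a column index, in order of first appearance."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)
    return vocab


def one_hot(values, vocab):
    """Encode each value as a row with a single 1 in its level's column."""
    rows = []
    for value in values:
        row = [0] * len(vocab)
        row[vocab[value]] = 1  # raises KeyError for a level not in vocab
        rows.append(row)
    return rows


# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

vocab = fit_vocabulary(colors)
encoded = one_hot(colors, vocab)

for color, row in zip(colors, encoded):
    print(color, row)
```

Because each level occupies its own column, the encoding imposes no ordering or grouping among levels. A level that was not seen when the vocabulary was built raises a `KeyError` in this sketch; real pipelines usually reserve an extra column for unknown values.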