Automated Machine Learning with AutoKeras

By: Luis Sobrecueva

Overview of this book

AutoKeras is an open-source AutoML library that provides easy access to deep learning models. If you are looking to build deep learning model architectures and perform parameter tuning automatically using AutoKeras, then this book is for you. This book teaches you how to develop and use state-of-the-art AI algorithms in your projects. It begins with a high-level introduction to automated machine learning, explaining all the concepts required to get started with this machine learning approach. You will then learn how to use AutoKeras for image and text classification and regression. As you make progress, you'll discover how to use AutoKeras to perform sentiment analysis on documents. This book will also show you how to implement a custom model for topic classification with AutoKeras. Toward the end, you will explore advanced AutoKeras concepts such as working with multi-modal data and multi-task models, customizing the model with AutoModel, and visualizing experiment results using AutoKeras Extensions. By the end of this machine learning book, you will be able to confidently use AutoKeras to design your own custom machine learning models.
Table of Contents (15 chapters)
Section 1: AutoML Fundamentals
Section 2: AutoKeras in Practice
Section 3: Advanced AutoKeras

Creating a sentiment analyzer

The model we are going to create is a binary sentiment classifier (1 = positive, 0 = negative) trained on the IMDb sentiment dataset, which contains 25,000 sentiment-labeled movie reviews for training and 25,000 for testing:

Figure 7.1 – Example of sentiment analysis being used on two samples

Similar to the Reuters example from Chapter 4, Image Classification and Regression Using AutoKeras, each review is encoded as a list of word indexes (integers). For convenience, words are indexed by their overall frequency in the dataset. So, for instance, the integer 3 encodes the third most frequent word in the data.
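To make the frequency-based encoding concrete, here is a minimal sketch in plain Python. It is an illustration only, not the dataset's actual preprocessing: the helper names `build_word_index` and `encode` and the three toy reviews are hypothetical, and ties between equally frequent words are broken alphabetically as an assumption of this sketch.

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to its frequency rank (1 = most frequent word)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count; break ties alphabetically so the result is deterministic.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ranked, start=1)}


def encode(text, word_index):
    """Encode a review as a list of integer word indexes, skipping unknown words."""
    return [word_index[w] for w in text.lower().split() if w in word_index]


# Hypothetical toy corpus standing in for the movie reviews.
toy_reviews = [
    "the movie was great",
    "the movie was terrible",
    "the plot was great",
]

word_index = build_word_index(toy_reviews)
encoded = encode("the movie was great", word_index)
print(word_index)
print(encoded)
```

In this toy corpus, the integer 1 stands for the most frequent word and larger integers stand for rarer words, which mirrors the idea described above. The real dataset ships already encoded, so this step is only needed to understand what the integers mean.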

The notebook that contains the complete source code can be found at https://github.com/PacktPublishing/Automated-Machine-Learning-with-AutoKeras/blob/main/Chapter07/Chapter7_IMDB_sentiment_analysis.ipynb.

Now, let's have a look at the relevant...