Hands-On Machine Learning with IBM Watson

By: James D. Miller

Overview of this book

IBM Cloud is a collection of cloud computing services for data analytics using machine learning and artificial intelligence (AI). This book is a complete guide to help you become well versed with machine learning on the IBM Cloud using Python. Hands-On Machine Learning with IBM Watson starts with supervised and unsupervised machine learning concepts, in addition to providing you with an overview of IBM Cloud and Watson Machine Learning. You'll gain insights into running various techniques, such as K-means clustering, K-nearest neighbor (KNN), and time series prediction in IBM Cloud with real-world examples. The book will then help you delve into creating a Spark pipeline in Watson Studio. You will also be guided through deep learning and neural network principles on the IBM Cloud using TensorFlow. With the help of NLP techniques, you can then brush up on building a chatbot. In later chapters, you will cover three powerful case studies, including the facial expression classification platform, the automated classification of lithofacies, and the multi-biometric identity authentication platform, helping you to become well versed with these methodologies. By the end of this book, you will be ready to build efficient machine learning solutions on the IBM Cloud and draw insights from the data at hand using real-world examples.
Table of Contents (15 chapters)
Section 1: Introduction and Foundation
Section 2: Tools and Ingredients for Machine Learning in IBM Cloud
Section 3: Real-Life Complete Case Studies

Data cleansing and preparation

Data cleansing and preparation is commonly described as the work of transforming raw data into a form that data scientists and analysts can more easily run through machine learning algorithms to uncover insights or make predictions from that data.

This process can be complicated by issues such as missing or incomplete records, or by extraneous columns of information in a data source.

In the previous example screenshot, we can see that the DataFrame object includes the columns country, description, designation, points, price, province, and so on.
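
As a rough illustration of how such a DataFrame might be inspected for the issues mentioned above, the following sketch lists the column names and counts the missing values in each column. It assumes the DataFrame is a pandas DataFrame. The variable name df and all of the sample values are hypothetical stand-ins; only the column names come from the text.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame described in the text.
# Only the column names come from the text; the values are invented.
df = pd.DataFrame({
    "country": ["US", "Spain", None],
    "description": ["Ripe and structured", "Earthy, firm tannins", "Bright acidity"],
    "designation": [None, "Reserva", None],
    "points": [96, 95, 90],
    "price": [235.0, None, 20.0],
    "province": ["California", "Northern Spain", "Oregon"],
})

# List the column names to spot columns that may be extraneous.
column_names = list(df.columns)
print(column_names)

# Count missing values per column to spot incomplete records.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Columns with many missing values, or columns that are irrelevant to the analysis, are natural candidates for removal or further cleansing.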

As an exercise to demonstrate how easily we can use Python within Watson Studio to prepare data, suppose that we want to drop one or more columns from the DataFrame. To accomplish this task, we use the following Python statements:

to_drop...
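
A minimal sketch of this kind of column removal follows, assuming the DataFrame is a pandas DataFrame. The names df, columns_to_drop, and trimmed, the choice of columns, and the sample values are all hypothetical and are not necessarily those used in the book.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame; the values are invented.
df = pd.DataFrame({
    "country": ["US", "Spain"],
    "description": ["Ripe and structured", "Earthy, firm tannins"],
    "designation": ["Estate", "Reserva"],
    "points": [96, 95],
    "price": [235.0, 110.0],
    "province": ["California", "Northern Spain"],
})

# Hypothetical choice of columns to remove.
columns_to_drop = ["description", "designation"]

# drop() returns a new DataFrame by default; the original df is left unchanged.
trimmed = df.drop(columns=columns_to_drop)

print(trimmed.columns.tolist())
```

Because drop() returns a new DataFrame unless inplace=True is passed, the original data remains available for comparison after the columns are removed.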