Machine Learning for Healthcare Analytics Projects

Overview of this book

Machine Learning (ML) has changed the way organizations and individuals use data to improve the efficiency of a system. ML algorithms allow strategists to deal with a variety of structured, unstructured, and semi-structured data. Machine Learning for Healthcare Analytics Projects is packed with new approaches and methodologies for creating powerful solutions for healthcare analytics. This book will teach you how to implement key machine learning algorithms and walk you through their use cases by employing a range of libraries from the Python ecosystem. You will build five end-to-end projects to evaluate the efficiency of Artificial Intelligence (AI) applications for carrying out simple-to-complex healthcare analytics tasks. With each project, you will gain new insights that will help you handle healthcare data efficiently. As you make your way through the book, you will use ML to detect cancer in a set of patients using support vector machine (SVM) and k-nearest neighbors (KNN) models. In the final chapters, you will create a deep neural network in Keras to predict the onset of diabetes in a huge dataset of patients. You will also learn how to predict heart disease using neural networks. By the end of this book, you will have learned how to address long-standing challenges in healthcare analytics, build specialized solutions for them, and carry out a range of cognitive tasks in the healthcare domain.
Table of Contents (7 chapters)

What this book covers

Chapter 1, Breast Cancer Detection, will show you how to import data from the UCI repository. In this chapter, we will name the columns (or features) and put them into a pandas DataFrame. We will learn how to preprocess our data and remove the ID column, and then explore the data so that we know more about it. We will see how to create histograms (so that we can understand the distributions of the different features) and a scatterplot matrix (so that we can look for linear relationships between the variables). We will learn how to set up some testing parameters, build a KNN classifier and an SVC, and compare their results using a classification report. Finally, we will build our own example cell and explore what it would take to actually get a malignant or benign classification.
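The KNN-versus-SVC comparison described for Chapter 1 can be sketched as follows. This is a minimal illustration, not the book's own code: it assumes scikit-learn is installed and uses scikit-learn's bundled breast cancer dataset (load_breast_cancer) as a stand-in for the file downloaded from the UCI repository; the split size, random seed, and n_neighbors value are arbitrary choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Assumption: the bundled dataset stands in for the UCI download.
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=8, stratify=data.target
)

# Two classifiers to compare; the hyperparameters are illustrative defaults.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    scores[name] = accuracy_score(y_test, predictions)
    print(name, "accuracy:", scores[name])
    # Per-class precision, recall, and F1 for the held-out test set.
    print(classification_report(y_test, predictions, target_names=data.target_names))
```

Classifying a hand-built example cell would then amount to passing a single row with the same feature layout to model.predict.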

Chapter 2, Diabetes Onset Detection, covers the building of a deep neural network in Keras. We will search for the optimal hyperparameters using the scikit-learn grid search and learn how to optimize the network by tuning them. We will then apply the network to predict the onset of diabetes in a huge dataset of patients.
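A grid search over network hyperparameters might look like the sketch below. To keep it self-contained, it swaps in scikit-learn's MLPClassifier for the Keras network the chapter builds, and it uses synthetic data from make_classification rather than a diabetes dataset; the grid values, the eight-feature shape, and the three-fold cross-validation are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data (an assumption, not patient data).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=0
)

# Standardize the inputs, then train a small feed-forward network.
pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(max_iter=500, random_state=0),
)

# Every combination in this grid is trained and scored with cross-validation.
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(8,), (8, 4)],
    "mlpclassifier__learning_rate_init": [0.001, 0.01],
}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

The same GridSearchCV pattern applies to any estimator that follows the scikit-learn interface, which is why a wrapped Keras model can be tuned the same way.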

Chapter 3, DNA Classification, will show how to predict the functional outcome (promoter or non-promoter) for a DNA sequence from E. coli bacteria with 96% accuracy. We will look at how to import data from a repository and how to convert textual inputs to numerical data. We will then learn to build and train classification algorithms and compare and contrast them using the classification report.
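One common way to convert a textual DNA sequence to numerical data is one-hot encoding of each nucleotide. The sketch below is an assumption about how that step could be done, not necessarily the book's approach; the function name one_hot_sequence and the example sequences are hypothetical.

```python
# Fixed nucleotide order used for the one-hot positions.
NUCLEOTIDES = "acgt"


def one_hot_sequence(sequence):
    """Return a flat list with four 0/1 values per nucleotide."""
    encoded = []
    for base in sequence.lower():
        if base not in NUCLEOTIDES:
            raise ValueError(f"Unexpected nucleotide: {base!r}")
        # Exactly one of the four positions is set to 1 for each base.
        encoded.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return encoded


# Hypothetical sequences, far shorter than real promoter data.
sequences = ["acgt", "ttga"]
features = [one_hot_sequence(s) for s in sequences]

for sequence, row in zip(sequences, features):
    print(sequence, row)
```

Each sequence of length n becomes a vector of 4n values, which can be fed to a scikit-learn classifier as one row of the feature matrix.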

Chapter 4, Diagnosing Coronary Artery Disease, will show how to use sklearn and Keras, how to import data from a UCI repository using the pandas read_csv function, and how to preprocess that data. We will then learn how to describe the data and print out histograms so we know what we're working with, followed by executing a train/test split with the train_test_split function from sklearn's model_selection module.

Furthermore, we will learn how to convert class labels to one-hot encoded vectors for a categorical classification and how to define simple neural networks using Keras. We will look at activation functions, such as softmax, for categorical classifications with the categorical_crossentropy loss. We will also look at how to fit our model to our training data for both categorical and binary problems. Ultimately, we will look at how to produce a classification report and an accuracy score for our results.
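To make the pieces named for Chapter 4 concrete, the following NumPy sketch shows what one-hot label encoding, a softmax activation, and the categorical cross-entropy loss each compute. It is an illustration of the underlying arithmetic, not Keras code; the helper names, the three-class labels, and the raw output scores are made up for the example.

```python
import numpy as np


def to_one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return np.eye(num_classes)[np.asarray(labels)]


def softmax(logits):
    """Convert each row of raw scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_crossentropy(y_true, y_prob, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    return float(-np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1)))


# Hypothetical labels and network outputs for three samples, three classes.
labels = [0, 2, 1]
y_true = to_one_hot(labels, num_classes=3)
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [0.3, 1.5, 0.2]])

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)

print(y_true)
print(probs.round(3))
print("loss:", loss)
```

For a binary problem, the output layer would instead use a single sigmoid unit with a binary cross-entropy loss, so no one-hot conversion of the labels is needed.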

Chapter 5, Autism Screening with Machine Learning, will show how to predict autism in patients with approximately 90% accuracy. We will also learn how to deal with categorical data; many health applications involve categorical data, and one way to handle it is to use one-hot encoded vectors. Furthermore, we will learn how to reduce overfitting using dropout regularization.
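Dropout regularization can be illustrated with a few lines of NumPy. The sketch below shows the widely used "inverted dropout" formulation, in which surviving activations are rescaled during training so that nothing changes at inference time. It is a conceptual illustration rather than the book's code (in Keras this is normally done with a Dropout layer); the function name, the 25% rate, and the all-ones activations are assumptions.

```python
import numpy as np


def dropout(activations, rate, training, rng):
    """Randomly zero a fraction `rate` of activations during training."""
    if not training or rate == 0.0:
        # At inference time the activations pass through unchanged.
        return activations
    keep_prob = 1.0 - rate
    # Each unit is kept independently with probability keep_prob.
    mask = rng.random(activations.shape) < keep_prob
    # Rescale the kept units so the expected activation stays the same.
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
hidden = np.ones((4, 5))  # hypothetical hidden-layer activations

train_out = dropout(hidden, rate=0.25, training=True, rng=rng)
eval_out = dropout(hidden, rate=0.25, training=False, rng=rng)

print(train_out)
print(eval_out)
```

Because a different random subset of units is silenced on every training step, the network cannot rely too heavily on any single unit, which is the intuition behind dropout's effect on overfitting.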