Healthcare Analytics Made Simple

By: Vikas (Vik) Kumar, Shameer Khader

Overview of this book

In recent years, machine learning technologies and analytics have been widely utilized across the healthcare sector. Healthcare Analytics Made Simple bridges the gap between practising doctors and data scientists. It equips data scientists to work with healthcare data and to gain better insight from that data in order to improve healthcare outcomes. This book is a complete overview of machine learning for healthcare analytics, briefly describing the current healthcare landscape, machine learning algorithms, and the Python and SQL programming languages. The step-by-step instructions teach you how to obtain real healthcare data and perform descriptive, predictive, and prescriptive analytics using popular Python packages such as pandas and scikit-learn. The latest research results in disease detection and healthcare image analysis are reviewed. By the end of this book, you will understand how to use Python for healthcare data analysis, how to import, collect, clean, and refine data from electronic health record (EHR) surveys, and how to make predictive models with this data through real-world algorithms and code examples.

Improving our models

Although the rudimentary model we built in this chapter matches the performance of academic research studies, there is certainly room for improvement. The following are some ideas for improving the model; we leave it to the reader to implement these suggestions, along with any other tricks or techniques the reader might know. How high will your performance go?

First and foremost, the current training data has a large number of columns. Some sort of feature selection is almost always performed, particularly for logistic regression and random forest models. For logistic regression, common methods of performing feature selection include:

  • Using a certain number of predictors that have the highest coefficients
  • Using a certain number of predictors that have the lowest p-values
  • Using lasso regularization and removing predictors...
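
As a rough illustration of two of the approaches listed above, the following is a minimal sketch using scikit-learn. It is not the chapter's own code: the data is synthetic (generated with make_classification as a stand-in for the real training set), and the values of k and C are arbitrary assumptions chosen only for demonstration. "Highest coefficients" is interpreted here as largest absolute value after standardizing the features, which is also an assumption. Selection by p-value is not shown, since scikit-learn's LogisticRegression does not report p-values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data (hypothetical, not the book's dataset).
X, y = make_classification(
    n_samples=500, n_features=30, n_informative=5, n_redundant=0, random_state=0
)

# Standardize so that coefficient magnitudes are comparable across predictors.
X_scaled = StandardScaler().fit_transform(X)

# Approach 1: keep the k predictors with the largest absolute coefficients.
k = 10  # assumed value, for illustration only
clf = LogisticRegression(max_iter=1000).fit(X_scaled, y)
top_k_idx = np.argsort(np.abs(clf.coef_[0]))[::-1][:k]
X_top_k = X_scaled[:, top_k_idx]

# Approach 3: fit with lasso (L1) regularization and drop the predictors
# whose coefficients were shrunk to exactly zero.
# A smaller C means stronger regularization; C=0.05 is an assumed value.
lasso_clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.05)
lasso_clf.fit(X_scaled, y)
kept_idx = np.flatnonzero(lasso_clf.coef_[0])
X_lasso = X_scaled[:, kept_idx]

print("Columns kept by top-k selection:", X_top_k.shape[1])
print("Columns kept by lasso selection:", X_lasso.shape[1])
```

In practice, k and C would normally be tuned with cross-validation rather than fixed by hand, and the reduced feature set should be evaluated on held-out data before concluding that it improves performance.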