Deep Learning for Genomics

By: Upendra Kumar Devisetty

Overview of this book

Deep learning has shown remarkable promise in the field of genomics; however, there is a lack of a skilled deep learning workforce in this discipline. This book will help researchers and data scientists to stand out from the rest of the crowd and solve real-world problems in genomics by developing the necessary skill set. Starting with an introduction to the essential concepts, this book highlights the power of deep learning in handling big data in genomics. First, you’ll learn about conventional genomics analysis, then transition to state-of-the-art machine learning-based genomics applications, and finally dive into deep learning approaches for genomics. The book covers all of the important deep learning algorithms commonly used by the research community and goes into the details of what they are, how they work, and their practical applications in genomics. The book dedicates an entire section to operationalizing deep learning models, which will provide the necessary hands-on tutorials for researchers and any deep learning practitioners to build, tune, interpret, deploy, evaluate, and monitor deep learning models from genomics big data sets. By the end of this book, you’ll have learned about the challenges, best practices, and pitfalls of deep learning for genomics.

Preface

Who is this book for?

What this book covers

To get the most out of this book

Download the example code files

Conventions used

Share Your Thoughts

Download a free PDF copy of this book

Part 1 – Machine Learning in Genomics

Chapter 1: Introducing Machine Learning for Genomics

What is machine learning?

Why machine learning for genomics?

Machine learning for genomics in life sciences and biotechnology

Chapter 2: Genomics Data Analysis

Technical requirements

What is a genome?

Genome sequencing

Analysis of genomic data

Introduction to Biopython for genomic data analysis

Chapter 3: Machine Learning Methods for Genomic Applications

Technical requirements

Genomics big data

Supervised and unsupervised ML

ML for genomics

An ML use case for genomics – Disease prediction

ML challenges in genomics

Part 2 – Deep Learning for Genomic Applications

Chapter 4: Deep Learning for Genomics

Understanding what deep learning is and how it works

Anatomy of deep neural networks

DNNs for genomics

Introducing deep learning algorithms and Python libraries

Chapter 5: Introducing Convolutional Neural Networks for Genomics

Introduction to CNNs

CNNs for genomics

Applications of CNNs in genomics

Chapter 6: Recurrent Neural Networks in Genomics

Introducing RNNs

Different RNN architectures

Applications and use cases of RNNs in genomics

Chapter 7: Unsupervised Deep Learning with Autoencoders

What is unsupervised DL?

Types of unsupervised DL

What are autoencoders?

Autoencoders for genomics

Chapter 8: GANs for Improving Models in Genomics

Challenges working with genomics datasets

How can GANs help improve models?

Practical applications of GANs in genomics

Part 3 – Operationalizing models

Chapter 9: Building and Tuning Deep Learning Models

Technical requirements

Data processing

Developing models

Tuning the models

Use case – Predicting the binding site location of the JunD TF

Chapter 10: Model Interpretability in Genomics

What is model interpretability?

Unlocking business value from model interpretability

Model interpretability methods in genomics

Use case – Model interpretability for genomics

Chapter 11: Model Deployment and Monitoring

Technical requirements

Introducing model deployment

Monitoring models using advanced tools

Chapter 12: Challenges, Pitfalls, and Best Practices for Deep Learning in Genomics

Deep learning challenges regarding genomics

Common pitfalls for applying deep learning to genomics

Best practices for applying deep learning to genomics

Index

Other Books You May Enjoy

Packt is searching for authors like you

What is machine learning?

Before we talk about ML, let’s understand what AI is. In the simplest terms, AI is the ability of a machine to mimic human intelligence and to improve iteratively based on the information it collects. The goal of AI is to build systems that perform tasks routinely done by humans, such as problem-solving, pattern matching, image recognition, and knowledge acquisition. ML, a subset of AI, is the process of training a model to learn and improve from experience. Deep learning (DL), in turn, is a subfield of ML in which artificial neural networks (ANNs), loosely inspired by the human brain, learn the nonlinear relationships between the input and output to generate predictions (Figure 1.1):

Figure 1.1 – AI versus ML versus DL – how they are related

In ML, a model is built from input data and an underlying algorithm so that it can make useful predictions on real-world data. In a simplified view of ML, “features”, each representing an individual measurable property of the data, are provided as input, and “labels” are returned as the predictions. Suppose we want to predict whether a particular DNA sequence has a binding site for a transcription factor (TF) of interest. Using the traditional approach, we would scan the sequence with a position weight matrix (PWM) and identify potential motifs that are overrepresented. Even though this works, it is difficult, largely manual, and hard to scale. Using an ML-based approach, we would instead give an ML model many DNA sequences that, based on experimental results, either have or don’t have binding sites (the labels), until the model learns the mathematical relationship between the features of those sequences and the labels. It then uses this knowledge to make informed predictions on new data. For example, we could give the ML model an unknown DNA sequence, and it would predict whether a binding site motif is present. This is one example of why ML is a good fit for genomics problems. Other ways in which ML can be used in genomics include identifying genetic disorders, predicting the type of cancer from genetic variants, and improving disease prognosis.
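To make the contrast between the two approaches concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than material from the book: the PWM probabilities (`TOY_PWM`), the uniform background frequency, and the labeled 4-bp sequences (`TOY_EXAMPLES`) are invented toy values. A simple perceptron stands in for "an ML model" only because it is short enough to read in one sitting. The first half scores every window of a sequence against a PWM. The second half turns sequences into one-hot features and learns weights from the labels.

```python
import math

BASES = "ACGT"

# --- Traditional approach: scan a sequence with a position weight matrix ---
# Toy per-position base probabilities for a 4-bp motif resembling "TGAC".
# These numbers are invented for illustration, not measured.
TOY_PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def pwm_score(window, pwm=TOY_PWM):
    """Log-odds score of one window against the PWM."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, window))


def best_pwm_hit(seq, pwm=TOY_PWM):
    """Slide the PWM along seq and return (start, score) of the best window."""
    width = len(pwm)
    scored = [(pwm_score(seq[i:i + width], pwm), i)
              for i in range(len(seq) - width + 1)]
    score, start = max(scored)
    return start, score


# --- ML approach: learn from features (one-hot bases) and labels ---
def one_hot(seq):
    """Features: four 0/1 indicators (A, C, G, T) per position, flattened."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def _score(weights, bias, seq):
    """Weighted sum of the one-hot features plus the bias."""
    return sum(w * x for w, x in zip(weights, one_hot(seq))) + bias


def train_perceptron(examples, epochs=100):
    """Learn weights from (sequence, label) pairs; label 1 = has binding site."""
    weights = [0.0] * len(one_hot(examples[0][0]))
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for seq, label in examples:
            pred = 1 if _score(weights, bias, seq) > 0 else 0
            if pred != label:
                mistakes += 1
                step = label - pred  # +1 for a missed site, -1 for a false alarm
                weights = [w + step * x for w, x in zip(weights, one_hot(seq))]
                bias += step
        if mistakes == 0:  # every training example is classified correctly
            break
    return weights, bias


def predict(model, seq):
    """Return 1 if the model predicts a binding site in seq, else 0."""
    weights, bias = model
    return 1 if _score(weights, bias, seq) > 0 else 0


# Toy labeled data (hypothetical): 1 = binding site present, 0 = absent.
TOY_EXAMPLES = [
    ("TGAC", 1), ("ACGT", 0), ("TGAG", 1), ("CCCC", 0),
    ("TGAT", 1), ("GATT", 0), ("AAAA", 0),
]

model = train_perceptron(TOY_EXAMPLES)

print(best_pwm_hit("AATGACGG")[0])  # start of the best window: 2
print(predict(model, "TGAA"))       # unseen sequence, predicted 1
print(predict(model, "ACGG"))       # unseen sequence, predicted 0
```

The PWM has to be supplied up front, whereas the perceptron derives its weights from the labeled examples alone, which is the distinction the paragraph above draws. Real binding-site models work on longer sequences, let the motif appear at any position, and need far more data than this toy set.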