R Bioinformatics Cookbook - Second Edition

By: Dan MacLean

Overview of this book

The updated second edition of R Bioinformatics Cookbook takes a recipe-based approach to show you how to conduct practical research and analysis in computational biology with R. You’ll learn how to create a useful and modular R working environment and how to load, clean, and analyze data using the most up-to-date Bioconductor, ggplot2, and tidyverse tools. This book will walk you through the Bioconductor tools you need to understand and carry out protocols in RNA-seq and ChIP-seq, phylogenetics, genomics, gene search, gene annotation, statistical analysis, and sequence analysis. As you advance, you’ll find out how to use Quarto to create data-rich reports, presentations, and websites, and you’ll get a clear understanding of how machine learning techniques can be applied in the bioinformatics domain. The concluding chapters will help you develop proficiency in key skills, such as gene annotation analysis and functional programming in purrr and base R. Finally, you’ll discover how to use the latest AI tools, including ChatGPT, to generate, edit, and understand R code and to draft workflows for complex analyses. By the end of this book, you’ll have gained a solid understanding of the skills and techniques needed to become a bioinformatics specialist and to work efficiently with large and complex bioinformatics datasets.
Table of Contents (16 chapters)

Finding enriched KEGG pathways

The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a bioinformatics resource that integrates genomic, chemical, and systemic functional information. KEGG contains a comprehensive database of molecular networks that represent various biological systems, such as metabolic pathways, regulatory pathways, and signaling pathways.

KEGG is widely used in bioinformatics analysis to understand the relationships between genes, proteins, and other biomolecules in biological systems. It provides a wealth of information on the molecular mechanisms of various biological processes, such as metabolism, signal transduction, and disease pathways. Researchers can use KEGG to analyze their own datasets, such as gene expression data or protein-protein interaction data, and identify the key pathways that are affected by their experiments.
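To make the idea of "identifying the key pathways affected" a little more concrete, here is a minimal sketch of one common way to test a gene list for pathway enrichment: a hypergeometric over-representation test in base R. This is an illustration only and may not be the method the recipe uses. The gene identifiers, the pathway membership, and the gene list are all made up for the example and do not come from KEGG.

```r
# Toy over-representation test for a single pathway.
# All identifiers below are hypothetical and exist only for illustration.
universe <- paste0("gene", 1:1000)                 # background: every gene measured
pathway  <- universe[1:50]                         # made-up pathway membership
hits     <- c(universe[1:10], universe[501:540])   # made-up experimental gene list

# Number of genes from the experimental list that fall in the pathway
overlap <- length(intersect(hits, pathway))

# Probability of seeing at least `overlap` pathway genes when drawing
# length(hits) genes at random, without replacement, from the universe
p_value <- phyper(
  q = overlap - 1,                            # P(X > overlap - 1) = P(X >= overlap)
  m = length(pathway),                        # pathway genes in the universe
  n = length(universe) - length(pathway),     # non-pathway genes in the universe
  k = length(hits),                           # size of the experimental gene list
  lower.tail = FALSE
)

p_value
```

In a real analysis, the same test would be repeated for every pathway and the resulting p-values corrected for multiple testing, for example with `p.adjust(p_values, method = "BH")`.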

In this recipe, we’ll look at how to examine a gene list derived from experiments in order to find pathways using the free-to...