R Bioinformatics Cookbook - Second Edition

By: Dan MacLean

Overview of this book

The updated second edition of R Bioinformatics Cookbook takes a recipe-based approach to show you how to conduct practical research and analysis in computational biology with R. You'll learn how to create a useful and modular R working environment and how to load, clean, and analyze data using the most up-to-date Bioconductor, ggplot2, and tidyverse tools. This book will walk you through the Bioconductor tools you need to understand and carry out protocols in RNA-seq and ChIP-seq, phylogenetics, genomics, gene search, gene annotation, statistical analysis, and sequence analysis. As you advance, you'll find out how to use Quarto to create data-rich reports, presentations, and websites, and get a clear understanding of how machine learning techniques can be applied in the bioinformatics domain. The concluding chapters will help you develop proficiency in key skills, such as gene annotation analysis and functional programming in purrr and base R. Finally, you'll discover how to use the latest AI tools, including ChatGPT, to generate, edit, and understand R code and draft workflows for complex analyses. By the end of this book, you'll have gained a solid understanding of the skills and techniques needed to become a bioinformatics specialist and work efficiently with large and complex bioinformatics datasets.
Table of Contents (16 chapters)

Using bioconda to install external tools

Package environments in R are extremely useful for keeping different package versions separate across projects. Bioinformatics pipelines often have numerous dependencies outside of R, including binary command-line programs and packages from other languages, and it is often useful to put those under the same sort of management. Tools exist that do exactly that.

Anaconda is a distribution of Python and R that is particularly popular for scientific computing, data science, and machine learning. It includes a large number of pre-installed packages and tools, such as NumPy, SciPy, and Jupyter, making it easy to get started with those technologies. Anaconda also includes the conda package manager, which can be used to install additional packages and manage environments. Bioconda is a channel (a package repository) of bioinformatics software built on top of conda. It includes a wide variety of bioinformatics tools and libraries, making it easy to install and manage those...
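
As a rough illustration of how conda and Bioconda fit together, the following shell sketch adds the Bioconda channels and installs a single tool into its own environment. It is not the book's recipe: the environment name bioinfo-tools, the choice of samtools as the example tool, and the run helper with its DRY_RUN switch are all assumptions made for this example. By default the script only prints the commands it would run, so it is safe to try on a machine without conda.

```shell
#!/usr/bin/env bash
# Sketch: configure Bioconda channels and install one tool in its own environment.
# ENV_NAME and TOOL are illustrative choices, not names taken from the book.
ENV_NAME="bioinfo-tools"
TOOL="samtools"

# Print each command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

setup_bioconda() {
  # --add prepends to the channel list, so conda-forge ends up with top priority
  run conda config --add channels bioconda
  run conda config --add channels conda-forge
  run conda config --set channel_priority strict
  # Create an isolated environment that contains the tool
  run conda create -y -n "$ENV_NAME" "$TOOL"
  # Call the tool inside that environment without activating it
  run conda run -n "$ENV_NAME" "$TOOL" --version
}

setup_bioconda
```

Saved as, say, setup_bioconda.sh (a hypothetical file name), the script can be run with DRY_RUN=0 bash setup_bioconda.sh once conda is installed to execute the commands for real. Keeping each external tool in a named environment mirrors the per-project package management described above for R.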