High-throughput sequencing has turbocharged metagenomics, taking it from a field focused on variation in single sequences, such as the 16S ribosomal RNA (rRNA) sequence, to one that studies the entire genomes of the many species that may be present in a sample. Identifying the species or taxa in a sample and their abundances is computationally challenging: the bioinformatician must prepare the sequences, assign them to taxa, compare taxa, and quantify them. Packages for these tasks have been developed by a wide range of specialist laboratories, which have created new tools and new visualizations specific to working with sequences in metagenomics.
In this chapter, we'll look at recipes to carry out some complex analyses in metagenomics with R:
- Loading in hierarchical taxonomic data using phyloseq
- Rarefying counts to correct for sample...