Bioinformatics with Python Cookbook - Second Edition

By: Tiago Antao

Overview of this book

Bioinformatics is an active research field that uses a range of simple-to-advanced computations to extract valuable information from biological data. This book covers next-generation sequencing, genomics, metagenomics, population genetics, phylogenetics, and proteomics. You'll learn modern programming techniques to analyze large amounts of biological data. With the help of real-world examples, you'll convert, analyze, and visualize datasets using various Python tools and libraries. This book will help you get a better understanding of working with a Galaxy server, which is the most widely used bioinformatics web-based pipeline system. This updated edition also includes advanced next-generation sequencing filtering techniques. You'll also explore topics such as SNP discovery using statistical approaches under high-performance computing frameworks such as Dask and Spark. By the end of this book, you'll be able to use and implement modern programming techniques and frameworks to deal with the ever-increasing deluge of bioinformatics data.

Introducing the Genepop format


The Genepop format is used in many conservation genetics studies. It is the format of the Genepop application and the de facto standard for much population genetics analysis. If you come from another field (for example, one centered on sequencing), you may not have heard of it, but the format is widely used, as its citation record shows, and is worth a look. Here, we will convert some datasets from previous recipes to this format and introduce the Genepop parser in Biopython.
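
To make the layout concrete before any conversion, here is a minimal sketch of what a Genepop file looks like: a title line, one locus name per line, then a Pop line opening each population, followed by one line per individual with a name, a comma, and one genotype per locus. The to_genepop function, the locus and individual names, and the allele codes are all hypothetical and are not part of this recipe's code; the sketch assumes two-digit allele codes, with 00 marking a missing allele.

```python
# Minimal sketch of the Genepop text layout; all names and data are hypothetical.

def to_genepop(title, loci, populations):
    """Return Genepop-formatted text.

    populations is a list of populations; each population is a list of
    (individual_name, genotypes) tuples, where genotypes holds one
    (allele_a, allele_b) pair of integer allele codes per locus and
    0 marks a missing allele.
    """
    lines = [title]            # line 1: free-text title
    lines.extend(loci)         # then one locus name per line
    for pop in populations:
        lines.append("Pop")    # each population starts with a "Pop" line
        for name, genotypes in pop:
            if len(genotypes) != len(loci):
                raise ValueError(f"{name}: expected {len(loci)} genotypes")
            # Two-digit allele codes, so each genotype is four digits.
            coded = " ".join(f"{a:02d}{b:02d}" for a, b in genotypes)
            lines.append(f"{name} , {coded}")
    return "\n".join(lines) + "\n"


# Hypothetical SNP data: 1 and 2 are arbitrary codes for the two alleles.
example = to_genepop(
    "Toy dataset",
    ["snp_1", "snp_2"],
    [
        [("ind_1", [(1, 1), (1, 2)]), ("ind_2", [(1, 2), (0, 0)])],
        [("ind_3", [(2, 2), (1, 1)])],
    ],
)
print(example)
```

Text in this shape is the kind of input that the Genepop parser in Biopython (the Bio.PopGen.GenePop module) is meant to read. Real datasets will have many more loci and individuals, and may use three-digit allele codes instead.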

Getting ready

You will need to run the previous recipe because its output is required for this one. I have a small library to help with basic data conversion and charting. You can find this code at https://github.com/tiagoantao/pygenomics, and you can install it with pip:

pip install pygenomics

Note that we will not use the Genepop application at this stage (this will change in the next recipe), so there is no need to install it for now.

As usual, this is available in the Chapter04...