Bioinformatics with Python Cookbook

By: Tiago R Antao, Tiago Antao

Thinking with generators


Writing generator functions is quite easy, but more importantly, generators let you write code in a different style that is more expressive and easier to change. Here, we will compute the GC skew of the first 1,000 records of a FASTQ file, both with and without the generators discussed in the preceding recipe. We will then change the code to add a filter (the median nucleotide quality has to be 40 or higher). This lets you see how the coding style that generators enable holds up when the code has to change.
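To make the contrast concrete before touching real data, here is a minimal sketch of the two styles on toy records. It is not the recipe's own code: the helper names (`gc_skew`, `high_quality`, `mean_skew_eager`, `mean_skew_lazy`) and the `(sequence, qualities)` tuples are hypothetical, GC skew is assumed to be the usual (G - C) / (G + C) ratio, and the sketch targets Python 3 with no Biopython dependency.

```python
from itertools import islice
from statistics import median


def gc_skew(seq):
    """Return (G - C) / (G + C), or 0.0 when the sequence has no G or C."""
    g = seq.count('G')
    c = seq.count('C')
    return (g - c) / (g + c) if g + c else 0.0


def mean(values):
    """Arithmetic mean; NaN for an empty list."""
    return sum(values) / len(values) if values else float('nan')


def mean_skew_eager(records, limit=1000, min_median=40):
    """Loop style: the limit and the filter both live inside the loop body."""
    skews = []
    for i, (seq, quals) in enumerate(records):
        if i >= limit:
            break
        if median(quals) < min_median:
            continue
        skews.append(gc_skew(seq))
    return mean(skews)


def high_quality(records, min_median=40):
    """Generator stage: pass through records whose median quality is high enough."""
    for seq, quals in records:
        if median(quals) >= min_median:
            yield seq, quals


def mean_skew_lazy(records, limit=1000, min_median=40):
    """Generator style: each concern is a separate stage in a pipeline."""
    first = islice(records, limit)            # take the first `limit` records
    kept = high_quality(first, min_median)    # then filter on quality
    return mean([gc_skew(seq) for seq, _ in kept])


# Toy data (made up for illustration): the second record fails the filter.
toy_records = [
    ('GGGC', [40, 40, 40, 40]),
    ('GCCC', [10, 10, 10, 10]),
    ('GGCC', [40, 41, 42, 43]),
]
eager_result = mean_skew_eager(toy_records)
lazy_result = mean_skew_lazy(iter(toy_records))
```

In the loop version, adding the filter means editing the body of an existing loop. In the generator version, it means inserting one more stage, which can be removed or reordered without touching the others.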

Getting ready

You should get the data as in the previous recipe, but in this case, you only need the first file, SRR003265_1.filt.fastq.gz.

As usual, this is available in the 08_Advanced/Generators.ipynb notebook.
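If you want to inspect the file lazily without Biopython, the following is a rough sketch of a generator-based reader. It assumes plain four-line FASTQ records and Phred+33 quality encoding; `read_fastq` is a hypothetical helper, not part of the recipe or of any library, and the demo file is made up for illustration.

```python
import gzip
import os
import tempfile


def read_fastq(path):
    """Yield (name, sequence, qualities) tuples from a gzipped FASTQ file.

    Assumes four lines per record and Phred+33 quality encoding.
    """
    with gzip.open(path, 'rt') as handle:
        while True:
            header = handle.readline().rstrip()
            if not header:          # end of file
                break
            seq = handle.readline().rstrip()
            handle.readline()       # skip the '+' separator line
            qual = handle.readline().rstrip()
            yield header[1:], seq, [ord(ch) - 33 for ch in qual]


# Demo on a tiny made-up file so the sketch runs without the real data.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo.fastq.gz')
with gzip.open(demo_path, 'wt') as out:
    out.write('@r1\nGGCA\n+\nIIII\n@r2\nAC\n+\n!I\n')
demo_records = list(read_fastq(demo_path))
```

Because `read_fastq` is a generator, only one record is held in memory at a time, so pointing it at SRR003265_1.filt.fastq.gz and taking the first 1,000 records does not require reading the whole file.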

How to do it...

Take a look at the following steps:

  1. Let's start with the required import code:

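    # Keep Python 2 and Python 3 behaviour consistent for division and print
    from __future__ import division, print_function
    import gzip
    import numpy as np
    from Bio import SeqIO, SeqUtils
    # Note: Bio.Alphabet was removed in Biopython 1.78, so this import
    # requires an older Biopython release
    from Bio.Alphabet import IUPAC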
  2. Then, print the mean...