Bioinformatics with Python Cookbook

By: Tiago R Antao, Tiago Antao


Implementing a basic PDB parser


As you know by now, the Bio.PDB parser is not complete. Here, we will develop a framework that allows you to parse other records in PDB files. Although we can expect a migration from PDB to the mmCIF format in the future, this is still useful in many situations.

Getting ready

In order to parse a format, we need its specification, which you can find at http://www.wwpdb.org/documentation/file-format.php. We will mostly be concerned with the secondary structure records (HELIX and SHEET), but the scaffold parser also handles a few other record types, and you can extend it to any others that you may need.

You can find this content in the 06_Prot/Parser.ipynb notebook.
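To make the idea of a fixed-column specification concrete, here is a minimal sketch (not the notebook's code) of how a single HELIX record could be read by slicing columns. The function name `parse_helix`, the dictionary keys, and the sample line are all illustrative assumptions; the column positions follow the HELIX record layout in the wwPDB format documentation, which you should check before relying on them.

```python
def parse_helix(line):
    """Parse one HELIX record by slicing its fixed columns.

    Hypothetical helper for illustration only. Column numbers in the
    comments are 1-based, as in the PDB format specification.
    """
    if not line.startswith('HELIX'):
        raise ValueError('not a HELIX record')
    # Pad to 80 characters so short lines do not break the slices.
    line = line.rstrip('\n').ljust(80)
    return {
        'serial': int(line[7:10]),            # columns 8-10
        'helix_id': line[11:14].strip(),      # columns 12-14
        'init_res_name': line[15:18].strip(), # columns 16-18
        'init_chain': line[19],               # column 20
        'init_seq_num': int(line[21:25]),     # columns 22-25
        'end_res_name': line[27:30].strip(),  # columns 28-30
        'end_chain': line[31],                # column 32
        'end_seq_num': int(line[33:37]),      # columns 34-37
        'helix_class': int(line[38:40]),      # columns 39-40
        'length': int(line[71:76]),           # columns 72-76
    }


# A made-up record (not taken from 1TUP), padded so that the helix
# length lands in columns 72-76.
sample = 'HELIX    1   1 SER A  166  GLY A  168  5' + ' ' * 35 + '3'
helix = parse_helix(sample)
print(helix['init_res_name'], helix['init_seq_num'],
      helix['end_res_name'], helix['end_seq_num'], helix['length'])
```

Slicing by position rather than splitting on whitespace matters here, because adjacent PDB fields can run together or be blank.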

How to do it...

Take a look at the following steps:

  1. First, let's retrieve a file to work with. We will only retrieve it, not parse it, as follows:

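    from __future__ import print_function  # only needed on Python 2
    from Bio import PDB

    # Download the entry into the current directory without parsing it.
    # file_format='pdb' assumes a recent Biopython, where the default is mmCIF.
    repository = PDB.PDBList()
    repository.retrieve_pdb_file('1TUP', pdir='.', file_format='pdb')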
  2. We will now devise a basic parsing framework...
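As a side note before any record-specific parsing, it can help to check which record types a retrieved file actually contains. The following is a rough sketch, separate from the recipe's framework; the helper name `count_record_types` and the truncated sample lines are made up for illustration. It relies only on the fact that the record name occupies the first six columns of each line.

```python
from collections import Counter


def count_record_types(lines):
    """Count record names, which occupy columns 1-6 of every PDB line.

    Hypothetical helper for illustration; blank lines are skipped.
    """
    return Counter(line[:6].strip() for line in lines if line.strip())


# Made-up, truncated lines standing in for a real file.
sample_lines = [
    'HEADER    EXAMPLE',
    'HELIX    1   1 SER A  166',
    'HELIX    2   2 CYS A  176',
    'SHEET    1   A 4 ARG A 110',
    'ATOM      1  N   SER A  96',
    '',
]
counts = count_record_types(sample_lines)
for name, total in sorted(counts.items()):
    print(name, total)
```

Because the function accepts any iterable of lines, an open file handle can be passed in directly. The name of the downloaded file depends on your Biopython version and the chosen format, so check the directory rather than assuming it.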