In this recipe, we will perform an initial exploratory analysis of one of our generated datasets.
We will analyze the 10 percent sampling of chromosome 2 without the offspring. We will look for monomorphic loci (in this case, SNPs) across populations, and we will examine minimum allele frequencies and expected heterozygosities.
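To make these three quantities concrete before running Genepop, here is a minimal pure-Python sketch that computes them for a single locus in a single population. It is not the recipe's code: the function names and the toy genotypes are hypothetical, and returning 0.0 as the minimum allele frequency of a monomorphic locus is a convention assumed here for illustration.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for diploid genotypes given as (a1, a2) tuples."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def is_monomorphic(freqs):
    """A locus is monomorphic when only one allele is observed."""
    return len(freqs) == 1


def minimum_allele_frequency(freqs):
    """Frequency of the rarest observed allele (assumed 0.0 if monomorphic)."""
    if is_monomorphic(freqs):
        return 0.0
    return min(freqs.values())


def expected_heterozygosity(freqs):
    """Expected heterozygosity under Hardy-Weinberg: 1 minus the sum of p_i squared."""
    return 1.0 - sum(p * p for p in freqs.values())


# Toy data (hypothetical): four individuals typed at one SNP, alleles coded 1 and 2.
toy_population = [(1, 1), (1, 2), (2, 2), (1, 1)]
toy_freqs = allele_frequencies(toy_population)

print(toy_freqs)
print(is_monomorphic(toy_freqs))
print(minimum_allele_frequency(toy_freqs))
print(expected_heterozygosity(toy_freqs))
```

In the recipe itself, these values come from Genepop's output rather than from hand-written functions; the sketch only shows what the numbers mean.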
You will need to have run the previous two recipes and should have the hapmap10_auto_noofs_2.gp and hapmap10_auto_noofs_2.pops files downloaded. We will also use the metadata file that we downloaded in the first recipe. For this code to work, you will need to install Genepop from either http://kimura.univ-montp2.fr/~rousset/Genepop.htm or, if you're using Anaconda Python, by using conda install -c bioconda genepop. We will use the interface provided by Biopython to execute Genepop and parse its output files.
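Before running the recipe, it can help to confirm that the Genepop executable is reachable from Python. The following is a small sketch using only the standard library. The helper name find_genepop is hypothetical, and the candidate executable names ("Genepop" and "genepop") are assumptions that may differ depending on how the program was installed.

```python
import shutil


def find_genepop(candidates=("Genepop", "genepop")):
    """Return the full path of the first candidate found on PATH, or None.

    The candidate names are assumptions; adjust them to match your installation.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


genepop_path = find_genepop()
if genepop_path is None:
    print("Genepop was not found on PATH; install it before continuing.")
else:
    print("Genepop found at", genepop_path)
```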
There is a Notebook file with this recipe, called Chapter04/Exploratory_Analysis.ipynb, but it will still...