Designing a poor human concurrent executor
We will start writing our own parallel executor. This executor will have no external dependencies, will be very lightweight, and will be able to work on multi-core computers. We will also supply a version for clusters that requires no interprocess communication mechanism other than a shared filesystem. We will perform a data analysis similar to the one in the previous recipe, but now in a truly concurrent environment.
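Before diving in, the following is a minimal sketch of the general idea, using only the standard library; it is an illustration under our own assumptions, not the code developed in this recipe. The class name PoorHumanExecutor and the submit/wait methods are invented here for illustration only.

import multiprocessing
import time

def _run(func, args):
    # Module-level wrapper so the target can be used by multiprocessing
    func(*args)

class PoorHumanExecutor:
    """Limit the number of simultaneously running worker processes."""
    def __init__(self, limit=None):
        # By default, run at most one worker per CPU core
        self.limit = limit or multiprocessing.cpu_count()
        self.running = []

    def submit(self, func, *args):
        # Block until a slot frees up, then launch a new worker process
        while len([p for p in self.running if p.is_alive()]) >= self.limit:
            time.sleep(0.1)
        self.running = [p for p in self.running if p.is_alive()]
        proc = multiprocessing.Process(target=_run, args=(func, args))
        proc.start()
        self.running.append(proc)

    def wait(self):
        # Wait for all submitted work to finish
        for proc in self.running:
            proc.join()

if __name__ == '__main__':
    executor = PoorHumanExecutor()
    for chunk in range(8):
        executor.submit(print, 'processing chunk', chunk)
    executor.wait()

The steps that follow develop this idea further, including the cluster variant that coordinates only through a shared filesystem.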
Getting ready
You should have read and understood the previous recipe. At the very least, you will need to download the HapMap data and run the pickling part of that recipe.
The code for this can be found in the 08_Advanced/Multiprocessing.ipynb notebook. There is also an external file called get_maf.py, which is available next to this notebook.
How to do it...
Take a look at the following steps:
Let's start with some boilerplate code, loading the largest chromosome position from the previous recipe, and defining the window size at 2 Mbp, as shown in the following code:
from __future__ import division, print_function
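# What follows is a hedged sketch of the rest of this boilerplate, not
# the notebook's exact code: we assume the largest position per
# chromosome was pickled in the previous recipe to a file named
# 'max_positions.pickle' (both the filename and the dictionary layout
# are assumptions made for illustration).
import pickle

window_size = 2000000  # 2 Mbp windows
with open('max_positions.pickle', 'rb') as handle:
    chrom_max_positions = pickle.load(handle)  # {chromosome: largest position}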