The following section puts our code to use by implementing the parallel Web crawler. In this scheme, we will use a very interesting Python resource, ThreadPoolExecutor, which is featured in the concurrent.futures module. In the previous example, in which we analyzed parallel_fibonacci.py, quite primitive forms of threads were used. Also, at a specific moment, we had to create and initialize more than one thread manually. In larger programs, this kind of situation becomes very difficult to manage. For such cases, there are mechanisms that provide a thread pool. A thread pool is nothing but a structure that keeps several previously created threads available for use in a certain process. It aims to reuse threads, thus avoiding the unnecessary creation of new threads, which is costly.
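To make the idea of thread reuse concrete, here is a minimal sketch that is separate from the crawler code. The task function, the run_tasks helper, and the pool size of three are illustrative assumptions; only ThreadPoolExecutor and its map method come from concurrent.futures.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def task(n):
    # Hypothetical unit of work: square the input and report
    # the name of the worker thread that executed it.
    return n * n, threading.current_thread().name


def run_tasks(numbers, max_workers=3):
    # The pool creates at most max_workers threads and reuses them
    # for every submitted task instead of starting one thread per task.
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() returns the results in the same order as the inputs.
        return list(executor.map(task, numbers))


results = run_tasks(range(10))
squares = [square for square, _ in results]
thread_names = {name for _, name in results}

print(squares)
print(len(thread_names), "worker thread(s) handled", len(results), "tasks")
```

Ten tasks are processed by no more than three threads, so no thread has to be created or initialized by hand. The with statement also shuts the pool down once all the tasks have finished.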
As mentioned in the previous chapter, we will have an algorithm that executes its tasks in stages, and these tasks depend on each other. Here, we will...