Let's first go right back to the very beginning and consider why we might choose to write a parallel program in the first place.
The simple answer, of course, is that we want to speed up our algorithm: to compute the answer far faster than we could by running in serial, where only a single thread of program execution can be utilized.
In this age of big data, we can extend this view to cover the otherwise incomputable: problems where the resources of a single machine make it intractable to run a complex algorithm across a massive volume of data. Here we must employ thousands upon thousands of computational cores, terabytes of memory, petabytes of storage, and a supporting management infrastructure that can cope with the inevitable runtime failure of individual components over the aggregate lifetime of a computation, which may total millions of hours.
Another approach to utilizing parallelization, and arguably its...