While optimizing an OpenCL program, we need to find the performance bottlenecks at each step so that we can improve on them. Here are some techniques for this investigation. On a Unix-based system, the time command reports the real (elapsed) time, user CPU time, and system CPU time of a program's execution. In Windows PowerShell, the built-in Measure-Command cmdlet gives the total running time of a program, much like the Linux time command. To measure the execution time of a function or any other part of a C or C++ program, we can use either the clock_t clock (void); function, which returns the processor time consumed by the program, or the time_t time (time_t* timer); and double difftime (time_t end, time_t beginning); functions, which work with calendar (wall-clock) time. All three are declared in the standard header <time.h> (<ctime> in C++). These and several other techniques are good enough for timing a CPU-based program.
In OpenCL optimization, our area of interest is a bit different. As a part of optimization of the entire program...