Profiling is an essential tool for tuning any high-performance application. OpenCL supports it by letting cl_event objects hold timing information, which can be retrieved with the clGetEventProfilingInfo function. For this to work, the command queue must be created with the CL_QUEUE_PROFILING_ENABLE flag set in the properties argument of clCreateCommandQueue.
If profiling is enabled on the queue, the following function returns profiling information for the enqueued command associated with the event object:
cl_int clGetEventProfilingInfo(cl_event event, cl_profiling_info param_name, size_t param_value_size, void *param_value, size_t *param_value_size_ret)
All four timestamps, CL_PROFILING_COMMAND_[QUEUED|SUBMIT|START|END], can be obtained using this function. The returned value is a 64-bit cl_ulong that holds the device time counter in nanoseconds. From these values you can determine when the command was enqueued, submitted, started, and finished.