The OpenCL Memory model guarantees a relaxed memory consistency between devices. This means that different work items may see a different view of global memory as the computation progresses. This leads to a bigger challenge for the developers to partition data and splitting computation tasks into different work items. Synchronization is required to ensure data consistency within the work items of a work group. One needs to make sure that the data the work item is accessing is always correct. This makes the application developers task a little complicated to write applications with relaxed consistency, and hence explicit synchronization mechanisms are required.
The x86/x86_64 CPU cache coherent architecture is different from the OpenCL relaxed memory architecture. In cache coherent systems, data that resides in the local processor caches is guaranteed to be consistent across processors. The programmer need not worry about the data partitioning in cache coherent architectures....