Index
A
- Accelerated Parallel Processing (APP) / Installation steps
- address space qualifiers
- about / Address space qualifiers
- __global/global address space / __global/global address space
- __local/local address space / __local/local address space
- __constant/constant address space / __constant/constant address space
- __private/private address space / __private/private address space
- restrictions / Restrictions
- algorithm
- OpenCL kernel code / OpenCL Kernel Code
- host code / The Host Code
- aligned attribute / Data type attributes
- AMD
- Llano architecture / Advanced Micro Devices, Inc. (AMD)
- about / Advanced Micro Devices, Inc. (AMD)
- GCN compute unit / Advanced Micro Devices, Inc. (AMD)
- graphics cards / Advanced Micro Devices, Inc. (AMD)
- AMD graphics card
- used, for OpenCL installation on Linux system / Installing OpenCL on a Linux system with an AMD graphics card
- used, for OpenCL installation on Windows system / Installing OpenCL on a Windows system with an AMD graphics card
- AMD Radeon HD 7870 / AMD Radeon HD 7870 Graphics Processor
- AMD Trinity APU / AMD A10 5800K APUs
- Apple OSX
- using, for OpenCL installation / Apple OSX
- application
- scaling / Application scaling
- architecture
- strategies / General tips
- arg_index / Setting kernel arguments
- arg_indx / Querying kernel argument
- arg_size / Setting kernel arguments
- arg_value / Setting kernel arguments
- Arithmetic operators
- about / Operators
- Arithmetic unary operators / Operators
- ARM
- Mali T6XX / ARM Mali GPUs
- Mali T628 graphics / ARM Mali GPUs
- Mali T628 / ARM Mali GPUs
B
- barrier function / OpenCL Kernel Code
- basic data types
- binaries / Creating and building program objects
- binary file
- creating / Creating binary files
- used, for SAXPY / SAXPY using the binary file
- binary_status / Creating and building program objects
- Bitonic sort
- about / Bitonic sort
- Bits Per Pixel (bpp) / Image representation
- blocking_map parameter / Mapping buffer objects
- Blocking_read / Blocking_read and Blocking_write
- Blocking_write / Blocking_read and Blocking_write
- blocking_write/blocking_read / Reading and writing buffers
- blocking_write parameter / Reading and writing buffers
- buffer
- about / Image details descriptor cl_image_desc
- reading / Reading and writing buffers
- writing / Reading and writing buffers
- mapping / Mapping of a buffer
- creating, from GL texture / Creating a buffer from GL texture
- buffer objects
- mapping / Mapping buffer objects
- querying / Querying buffer objects
- buffer parameter / Creating subbuffer objects, Reading and writing buffers, Mapping buffer objects
- buffers
- reading / Reading and writing buffers
- writing / Reading and writing buffers
- Blocking_read / Blocking_read and Blocking_write
- Blocking_write / Blocking_read and Blocking_write
- rectangular reads / Rectangular or cuboidal reads
- cuboidal reads / Rectangular or cuboidal reads
- copying / Copying buffers
- buffer_create_info parameter / Creating subbuffer objects
- buffer_create_type parameter / Creating subbuffer objects
- built-in data types
- basic data types / Basic data types and vector types
- vector types / Basic data types and vector types
- half data type / The half data type
- reserved data type / Reserved data types
- alignment / Alignment of data types
- vector data types / Vector data types
- vector components / Vector components
- built-in functions
- about / Built-in functions
- work item function / Work item function
- synchronization / Synchronization and memory fence functions
- memory fence functions / Synchronization and memory fence functions
- built-in kernels / Built-in kernels
C
- case study
- matrix multiplication / Case study – matrix multiplication
- histogram calculation / Case study – Histogram calculation
- clBuildProgram function / Creating and building program objects, Creating binary files, Offline and online compilation, SAXPY using the binary file, Alignment of data types
- clCreateBuffer function / Creating images
- clCreateCommandQueue function / NDRange
- clCreateEventFromGLsyncKHR command / Synchronization
- clCreateImage function / Reading and writing buffers, Image histogram computation
- clCreateKernel function / Creating kernel objects
- clCreateKernelsInProgram function / Creating kernel objects
- clCreateProgramWithBinary function / Offline and online compilation
- clCreateProgramWithBuiltInKernels function / Built-in kernels
- clCreateProgramWithSource function / Creating and building program objects, Querying kernel argument
- clCreateSampler function / Samplers
- clCreateUserEvent function / User-created events
- clEnqueueBarrierWithWaitList function / Coarse-grained synchronization, Event-based or fine-grained synchronization
- clEnqueueCopyImage function / Copying and filling images
- clEnqueueFillImage function / Copying and filling images
- clEnqueueMapBuffer function / Mapping buffer objects
- clEnqueueMapImage function / Mapping image objects
- clEnqueueMarkerWithWaitList function / Event-based or fine-grained synchronization
- clEnqueueNDRange function / Execution model
- clEnqueueNDRangeKernel function / NDRange, Explaining the code
- clEnqueueReadBuffer function / Reading and writing buffers, Copying buffers, Mapping buffer objects
- clEnqueueReadImage function / Reading and writing buffers
- clEnqueueReleaseGLObjects() function / Listing Interoperation steps
- clEnqueueTask function / Executing the kernels
- clEnqueueWriteImage function / Reading and writing buffers
- clFinish() function / Listing Interoperation steps
- clFinish function / The Host Code, Coarse-grained synchronization
- clGet*Info function / Getting information about cl_event
- clGetDeviceInfo function / Image histogram computation, __local/local address space
- clGetEventInfo function / Getting information about cl_event
- clGetEventProfilingInfo function / Event profiling
- clGetImageInfo function / Querying image objects
- clGetKernelArgInfo function / Querying kernel argument
- clGetKernelInfo function / Variable attribute
- clGetMemObjectInfo function / Querying image objects
- clGetPlatformIDs / Query platforms
- clGetPlatformIDs() command / Initializing OpenCL context for OpenGL Interoperation
- clGetProgramBuildInfo function / Creating and building program objects
- clGetProgramInfo function / Creating and building program objects, Querying program objects
- CLK_ADDRESS_CLAMP / Samplers
- CLK_ADDRESS_CLAMP_TO_EDGE / Samplers
- CLK_ADDRESS_MIRRORED_REPEAT / Samplers
- CLK_ADDRESS_NONE / Samplers
- CLK_ADDRESS_REPEAT / Samplers
- CLK_FILTER_LINEAR / Samplers
- CLK_FILTER_NEAREST / Samplers
- CLK_GLOBAL_MEM_FENCE / Memory fences
- CLK_LOCAL_MEM_FENCE / Memory fences
- CLK_NORMALIZED_COORDS_FALSE
- about / Samplers
- CLK_NORMALIZED_COORDS_TRUE
- about / Samplers
- clLinkProgram function / Creating binary files
- clReleaseCommandQueue function / Coarse-grained synchronization
- clReleaseMemObject function / User-created events
- clReleaseProgram function / Releasing program and kernel objects
- clRetainEvent function / Getting information about cl_event
- clSetKernelArg function / SAXPY using the binary file, Setting kernel arguments, __constant/constant address space, Case study – Histogram calculation
- clWaitForEvents function / Coarse-grained synchronization, Event-based or fine-grained synchronization
- CL_COMMAND_USER command / Getting information about cl_event
- CL_COMPLETE / Event-based or fine-grained synchronization
- cl_event object / Finding the performance of your program?
- CL_EVENT_COMMAND_EXECUTION_STATUS / Getting information about cl_event
- CL_EVENT_COMMAND_QUEUE / Getting information about cl_event
- CL_EVENT_COMMAND_TYPE / Getting information about cl_event
- CL_EVENT_CONTEXT / Getting information about cl_event
- CL_EVENT_REFERENCE_COUNT / Getting information about cl_event
- CL_IMAGE_ARRAY_SIZE / Querying image objects
- CL_IMAGE_BUFFER / Querying image objects
- CL_IMAGE_DEPTH / Querying image objects
- cl_image_desc structure / Image details descriptor cl_image_desc
- CL_IMAGE_ELEMENT_SIZE / Querying image objects
- CL_IMAGE_FORMAT / Querying image objects
- cl_image_format image format descriptor / Image format descriptor cl_image_format
- CL_IMAGE_HEIGHT / Querying image objects
- CL_IMAGE_ROW_PITCH / Querying image objects
- CL_IMAGE_SLICE_PITCH / Querying image objects
- CL_IMAGE_WIDTH / Querying image objects
- cl_kernel object / SAXPY using the binary file, Querying kernel objects
- CL_KERNEL_ARG_ACCESS_QUALIFIER / Querying kernel argument
- CL_KERNEL_ARG_ADDRESS_QUALIFIER / Querying kernel argument
- CL_KERNEL_ARG_NAME / Querying kernel argument
- CL_KERNEL_ARG_TYPE_NAME / Querying kernel argument
- CL_KERNEL_ARG_TYPE_QUALIFIER / Querying kernel argument
- CL_KERNEL_ATTRIBUTES / Querying kernel objects
- CL_KERNEL_CONTEXT / Querying kernel objects
- CL_KERNEL_FUNCTION_NAME / Querying kernel objects
- CL_KERNEL_GLOBAL_WORK_SIZE / Querying kernel argument
- CL_KERNEL_LOCAL_MEM_SIZE / Querying kernel argument
- CL_KERNEL_NUM_ARGS / Querying kernel objects
- CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE / Querying kernel argument
- CL_KERNEL_PRIVATE_MEM_SIZE / Querying kernel argument
- CL_KERNEL_PROGRAM / Querying kernel objects
- CL_KERNEL_REFERENCE_COUNT / Querying kernel objects
- CL_KERNEL_WORK_GROUP_SIZE / Querying kernel argument
- cl_mem buffer object / Creating subbuffer objects
- cl_mem object / Memory objects, Reading and writing buffers
- CL_MEM_ALLOC_HOST_PTR / Memory objects
- CL_MEM_ASSOCIATED_MEMOBJECT / Querying buffer objects
- CL_MEM_CONTEXT / Querying buffer objects
- CL_MEM_COPY_HOST_PTR / Memory objects
- CL_MEM_FLAGS / Querying buffer objects
- CL_MEM_HOST_NO_ACCESS / Memory objects
- CL_MEM_HOST_PTR / Querying buffer objects
- CL_MEM_HOST_READ_ONLY / Memory objects
- CL_MEM_HOST_WRITE_ONLY / Memory objects
- CL_MEM_MAP_COUNT / Querying buffer objects
- CL_MEM_OFFSET / Querying buffer objects
- CL_MEM_READ_ONLY / Memory objects
- CL_MEM_READ_WRITE / Memory objects
- CL_MEM_REFERENCE_COUNT / Querying buffer objects
- CL_MEM_SIZE / Querying buffer objects
- CL_MEM_TYPE / Querying buffer objects
- CL_MEM_USE_HOST_PTR / Memory objects
- CL_MEM_WRITE_ONLY / Memory objects
- cl_program object / Creating program objects, Built-in kernels, Alignment of data types
- CL_PROGRAM_BINARIES / Querying program objects
- CL_PROGRAM_BINARY_SIZES / Querying program objects
- CL_PROGRAM_BINARY_TYPE / Creating and building program objects
- CL_PROGRAM_BUILD_LOG / Creating and building program objects
- CL_PROGRAM_BUILD_OPTIONS / Creating and building program objects
- CL_PROGRAM_BUILD_STATUS / Creating and building program objects
- CL_PROGRAM_CONTEXT / Querying program objects
- CL_PROGRAM_DEVICES / Querying program objects
- CL_PROGRAM_KERNEL_NAMES / Querying program objects
- CL_PROGRAM_NUM_DEVICES / Querying program objects
- CL_PROGRAM_NUM_KERNELS / Querying program objects
- CL_PROGRAM_REFERENCE_COUNT / Querying program objects
- CL_PROGRAM_SOURCE / Querying program objects
- CL_QUEUED / Event-based or fine-grained synchronization
- CL_RUNNING / Event-based or fine-grained synchronization
- CL_SUBMITTED / Event-based or fine-grained synchronization
- cl_ulong variable / Event profiling
- coalesced memory access
- about / Kernel optimization techniques
- coarse grained synchronization
- about / Coarse-grained synchronization
- code
- about / Explaining the code
- command synchronization / OpenCL events and monitoring these events
- command_queue command / Executing the kernels
- command_queue object / NDRange, Reading and writing buffers
- command_queue parameter / Reading and writing buffers
- Compute Engines (CU) / Advanced Micro Devices, Inc. (AMD)
- computer architecture
- constant memory / Constant memory
- context / Creating and building program objects, Built-in kernels
- context parameter / Memory objects
- convert* function / Explicit conversion
- count / Creating and building program objects
- cuboidal reads / Rectangular or cuboidal reads
- CUDA / CUDA
D
- data type attributes
- about / Data type attributes
- aligned attribute / Data type attributes
- packed attribute / Data type attributes
- data types
- reinterpreting / Reinterpreting data types
- DCT coefficient
- about / Encoding JPEG
- quantization / Encoding JPEG
- device / Creating and building program objects, Querying kernel argument
- devices / OpenCL context
- device_list / Creating and building program objects, Built-in kernels
- DHT (Define Huffman Table) / Encoding JPEG
- distanceF function / k-Nearest Neighborhood (k-NN) algorithm
- dst_origin parameter / Copying and filling images
E
- endiantype attribute / Variable attribute
- EOI (End of Image) / Encoding JPEG
- errcode_ret / OpenCL context, Creating and building program objects, Creating kernel objects
- errcode_ret parameter / Memory objects, Creating subbuffer objects, Mapping buffer objects
- errorcode_ret / Creating and building program objects
- event / NDRange, Getting information about cl_event
- event based synchronization
- event object / Event profiling
- event parameter / Reading and writing buffers, Mapping buffer objects
- event profiling
- about / Event profiling
- event_wait_list object / NDRange
- event_wait_list parameter / Reading and writing buffers, Mapping buffer objects
- Execution model
- about / Execution model
- NDRange / NDRange
- work-item / NDRange
- work-group / NDRange
- global-id / NDRange
- local-id / NDRange
- OpenCL context / OpenCL context
- OpenCL command queue / OpenCL command queue
- execution model / Run on a different device
- Execution Units (EU) / Intel®
- Execution Units (EUs) / Intel® IVY bridge
- Explicit conversion
- about / Explicit conversion
- extensionString variable / Detecting if OpenCL-OpenGL Interoperation is supported
F
- fence object / Synchronization
- fill_color parameter / Copying and filling images
- filter variable / __constant/constant address space
- fine grained synchronization
- first in first out (FIFO) / OpenCL command queue
- flags parameter / Memory objects, Creating subbuffer objects, Creating images
- float variable / Alignment of data types
- function attributes / Function attributes
- Fused Multiply Add (FMA) / Basic data types and vector types
G
- Gaussian filter / Gaussian filter
- glBindBuffer() / Mapping of a buffer
- glBufferData() / Mapping of a buffer
- glFenceSync() function / Synchronization
- glFinish() function / Listing Interoperation steps
- glGenBuffers() / Mapping of a buffer
- global-id / NDRange
- global memory / Global memory
- global_work_offset / Executing the kernels
- global_work_offset object / NDRange
- global_work_size / Executing the kernels
- global_work_size function / NDRange
- global_work_size object / NDRange
- GL texture
- buffer, creating from / Creating a buffer from GL texture
- Graphics Core Next (GCN) / Advanced Micro Devices, Inc. (AMD)
- Graphics Processing Clusters (GPC) / NVIDIA®
H
- half data type / The half data type
- operating on / Operation on half data type
- histogram
- about / Histogram calculation
- algorithm / Algorithm
- histogram calculation / Case study – Histogram calculation
- host code / The Host Code
- host notification / OpenCL events and monitoring these events
- host_ptr / Creating images
- host_ptr parameter / Memory objects
- Huffman coding
- quantization / Encoding JPEG
- hybrid parallel computing model / Hybrid parallel computing model
I
- ICD (Installable Client Driver) / Installation steps
- image access qualifiers
- about / Image access qualifiers
- function attributes / Function attributes
- data type attributes / Data type attributes
- variable attribute / Variable attribute
- image buffers
- passing, to kernels / Passing image buffers to kernels
- image filters
- implementing / Implementing image filters
- mean filter / Mean filter
- median filter / Median filter
- Gaussian filter / Gaussian filter
- Sobel filter / Sobel filter
- image histogram
- computing / Image histogram computation
- image object
- about / Reading and writing buffers
- mapping / Mapping image objects
- querying / Querying image objects
- images
- creating / Creating images
- cl_image_format image format descriptor / Image format descriptor cl_image_format
- cl_image_desc structure / Image details descriptor cl_image_desc
- image buffers, passing to kernels / Passing image buffers to kernels
- copying / Copying and filling images
- filling / Copying and filling images
- representing / Image representation
- PBM (Portable Bit Map) / Image representation
- Bits Per Pixel (bpp) / Image representation
- PGM (Portable Gray Map) / Image representation
- PPM (Portable Pixel Map) / Image representation
- image_array_size / Image details descriptor cl_image_desc
- image_depth / Image details descriptor cl_image_desc
- image_height / Image details descriptor cl_image_desc
- image_row_pitch / Image details descriptor cl_image_desc
- image_slice_pitch / Image details descriptor cl_image_desc
- image_type / Image details descriptor cl_image_desc
- image_width / Image details descriptor cl_image_desc
- Implicit conversion
- about / Implicit conversion
- Instruction Set Architecture (ISA) / Basic data types and vector types
- Intel
- about / Intel®
- INTEL IVY bridge / Intel® IVY bridge
- Intermediate Language (IL) / Tools for profiling and finding performance bottlenecks
- Interoperation
- about / Defining Interoperation
- implementing / Implementing Interoperation
- OpenCL-OpenGL Interoperation support, detecting / Detecting if OpenCL-OpenGL Interoperation is supported
- OpenCL context, initializing for OpenGL Interoperation / Initializing OpenCL context for OpenGL Interoperation
- buffer, mapping / Mapping of a buffer
- steps, listing / Listing Interoperation steps
- synchronization / Synchronization
- buffer, creating from GL texture / Creating a buffer from GL texture
- Renderbuffer object / Renderbuffer object
- intptr_t data type / Other data types
- is_less function / Reinterpreting data types
J
- JPEG
- about / JPEG compression
- encoding / Encoding JPEG
- JPEG compression
- about / JPEG compression
- OpenCL implementation / OpenCL implementation
- JPEG encoding
- about / JPEG compression, Encoding JPEG
- run length encoding / Encoding JPEG
- Huffman coding / Encoding JPEG
K
- K-Nearest Neighborhood (k-NN) algorithm
- kernel / Setting kernel arguments, Executing the kernels, Querying kernel objects, Querying kernel argument, Simple kernel
- kernel argument
- setting / Setting kernel arguments
- querying / Querying kernel argument
- kernel object / NDRange
- kernel objects
- creating / Creating kernel objects
- kernel argument, setting / Setting kernel arguments
- kernels, executing / Executing the kernels
- querying / Querying kernel objects
- kernel argument, querying / Querying kernel argument
- releasing / Releasing program and kernel objects
- program, releasing / Releasing program and kernel objects
- built-in kernels / Built-in kernels
- kernel optimization techniques
- about / Kernel optimization techniques
- kernels / Creating kernel objects
- image buffers, passing to / Passing image buffers to kernels
- executing / Executing the kernels
- kernel_name / Creating kernel objects
- kernel_names / Built-in kernels
L
- least square curve fitting
- about / Regression with least square curve fitting
- linear approximation / Linear approximations
- parabolic approximation / Parabolic approximations
- implementing / Implementation
- lengths / Creating and building program objects
- linear approximation / Linear approximations
- local-id / NDRange
- local memory / Local memory
- local_work_size / Executing the kernels
- local_work_size object / NDRange
- LOG_OCL_ERROR utility / The Host Code
M
- main() function / Sequential implementation
- malloc function / Creating and building program objects
- matrix multiplication
- sequential implementation / Sequential implementation
- OpenCL implementation / OpenCL implementation
- kernel / Simple kernel
- kernel optimization techniques / Kernel optimization techniques
- MCU (Minimum Coded Unit) / Encoding JPEG
- Mean and Gaussian filter / Mean and Gaussian filter
- mean filter / Mean filter
- median filter / Median filter
- memory fence functions / Synchronization and memory fence functions
- memory fences
- about / Memory fences
- memory model / Run on a different device
- Memory model
- about / Memory model
- global memory / Global memory
- constant memory / Constant memory
- local memory / Local memory
- private memory / Private memory
- memory objects
- about / Memory objects
- MPI / MPI
- multiple devices
- and different OpenCL contexts / Multiple devices and different OpenCL contexts
- and single OpenCL context / Multiple devices and single OpenCL context
N
- NDRange / NDRange
- num_devices / OpenCL context, Creating and building program objects, Built-in kernels
- num_events object / Event-based or fine-grained synchronization
- num_events_in_wait_list parameter / Reading and writing buffers, Mapping buffer objects
- num_kernels / Creating kernel objects
- num_kernels_ret / Creating kernel objects
- NVIDIA
- NVIDIA graphics card
- used, for OpenCL installation on Linux system / Installing OpenCL on a Linux system with an NVIDIA graphics card
- used, for OpenCL installation on Windows system / Installing OpenCL on a Windows system with an NVIDIA graphics card
- NVIDIA GTX 680 / NVIDIA® GeForce® GTX 680 GPU
O
- offline compilation / Offline and online compilation
- offset parameter / Reading and writing buffers, Mapping buffer objects
- online compilation / Offline and online compilation
- OpenACC / OpenACC
- OpenCL
- about / CUDA or OpenCL?, Introduction to OpenCL
- goal / Introduction to OpenCL
- hardware vendors / Hardware and software vendors
- components / OpenCL components
- installation, steps / Installation steps
- SAXPY routine, implementing / Implement the SAXPY routine in OpenCL
- implementing / OpenCL implementation
- using / Finding the scope of the use of OpenCL
- filter implementation / OpenCL implementation of filters
- OpenCL-OpenGL Interoperation support
- OpenCL command queue / OpenCL command queue
- OpenCL context / OpenCL context
- initializing, for OpenGL Interoperation / Initializing OpenCL context for OpenGL Interoperation
- OpenCL event
- about / OpenCL events and monitoring these events
- monitoring / OpenCL events and monitoring these events
- synchronization models / OpenCL event synchronization models
- OpenCL filter implementation
- Mean and Gaussian filter / Mean and Gaussian filter
- Median filter / Median filter
- Sobel filter / Sobel filter
- OpenCL ICD
- about / OpenCL ICD, What is an OpenCL ICD?
- OpenCL installation
- on Linux system, with AMD graphics card / Installing OpenCL on a Linux system with an AMD graphics card
- on Linux system, with NVIDIA graphics card / Installing OpenCL on a Linux system with an NVIDIA graphics card
- on Windows system, with AMD graphics card / Installing OpenCL on a Windows system with an AMD graphics card
- on Windows system, with NVIDIA graphics card / Installing OpenCL on a Windows system with an NVIDIA graphics card
- Apple OSX / Apple OSX
- multiple installations / Multiple installations
- OpenCL kernel code / OpenCL Kernel Code
- OpenCL program
- software requirements / Basic software requirements
- compliant computer, installing / Installing and setting up an OpenCL compliant computer
- compliant computer, setting up / Installing and setting up an OpenCL compliant computer
- OpenCL program building / OpenCL program building options
- OpenCLStruct function / Alignment of data types
- OpenGL
- about / Introduction to OpenGL
- OpenGL Interoperation
- OpenCL context, initializing for / Initializing OpenCL context for OpenGL Interoperation
- OpenMP / OpenMP
- operators
- about / Operators
- half data type, operating on / Operation on half data type
- options / Creating and building program objects
- origin / Reading and writing buffers
P
- packed attribute / Data type attributes
- parabolic approximation / Parabolic approximations
- parallel programming techniques
- about / Different parallel programming techniques
- OpenMP / OpenMP
- MPI / MPI
- OpenACC / OpenACC
- CUDA / CUDA, CUDA or OpenCL?
- OpenCL / CUDA or OpenCL?
- Renderscripts / Renderscripts
- hybrid parallel computing model / Hybrid parallel computing model
- param_name / Creating and building program objects, Querying program objects, Querying kernel objects, Querying kernel argument, Getting information about cl_event
- param_value / Creating and building program objects, Querying program objects, Querying kernel objects
- param_value_size / Creating and building program objects, Querying program objects, Querying kernel objects
- param_value_size_ret / Creating and building program objects, Querying program objects, Querying kernel objects
- PBM (Portable Bit Map) / Image representation
- performance
- finding, of program / Finding the performance of your program?
- finding, tools used / Tools for profiling and finding performance bottlenecks
- advantages / Kernel optimization techniques
- performance-bottleneck
- finding, tools used / Tools for profiling and finding performance bottlenecks
- pfn_notify / OpenCL context, Creating and building program objects
- PGM (Portable Gray Map) / Image representation
- platform model / Run on a different device
- Platform model
- about / Platform model
- AMD Trinity APU / AMD A10 5800K APUs
- AMD Radeon HD 7870 / AMD Radeon HD 7870 Graphics Processor
- NVIDIA GTX 680 / NVIDIA® GeForce® GTX 680 GPU
- INTEL IVY bridge / Intel® IVY bridge
- Platform versions
- about / Platform versions
- Query Platforms / Query platforms
- Query devices / Query devices
- PPM (Portable Pixel Map) / Image representation
- PrintDeviceInfo() function / Query devices
- private memory / Private memory
- profiling
- program / Creating and building program objects, Querying program objects, Creating kernel objects
- releasing / Releasing program and kernel objects
- performance, finding / Finding the performance of your program?
- programming model / Run on a different device
- program objects
- creating / Creating program objects, Creating and building program objects
- building / Creating and building program objects
- OpenCL program building / OpenCL program building options
- querying / Querying program objects
- binary file, creating / Creating binary files
- offline compilation / Offline and online compilation
- online compilation / Offline and online compilation
- SAXPY, binary file used / SAXPY using the binary file
- SPIR / SPIR – Standard Portable Intermediate Representation
- properties / OpenCL context
- ptr / Reading and writing buffers
- ptrdiff_t data type / Other data types
- ptr parameter / Reading and writing buffers
Q
- Query devices / Query devices
- Query Platforms / Query platforms
R
- read_imageui function / Image histogram computation
- rectangular reads / Rectangular or cuboidal reads
- region / Reading and writing buffers
- regression
- with least square curve fitting / Regression with least square curve fitting
- Renderbuffer object / Renderbuffer object
- Renderscripts / Renderscripts
- reserved data type / Reserved data types
- restrictions / Restrictions
- row_pitch object / Reading and writing buffers
- rules
- aliasing / Aliasing rules
S
- samplers
- about / Samplers
- SAXPY
- about / Implement the SAXPY routine in OpenCL
- binary file, using / SAXPY using the binary file
- SAXPY routine
- implementing, in OpenCL / Implement the SAXPY routine in OpenCL
- SAXPY routine implementations, in OpenCL
- about / Implement the SAXPY routine in OpenCL
- OpenCL code / OpenCL code
- OpenCL program flow / OpenCL program flow
- kernel, running on CPU / Run on a different device
- platform model / Run on a different device
- memory model / Run on a different device
- execution model / Run on a different device
- programming model / Run on a different device
- saxpy_kernel function / Execution model, SAXPY using the binary file
- SDK
- for NVIDIA, URL / Tools for profiling and finding performance bottlenecks
- for AMD, URL / Tools for profiling and finding performance bottlenecks
- sequential implementation / Sequential implementation
- single device
- and out-of-order queue / Single device and out-of-order queue
- single device in-order usage / Single device in-order usage
- sizeof() operator / Other data types
- size parameter / Memory objects, Reading and writing buffers, Mapping buffer objects, Creating images
- size_t data type / Other data types
- size_t get_global_id (uint dimindx) function / Work item function
- size_t get_global_offset (uint dimindx) function / Work item function
- size_t get_global_size (uint dimindx) function / Work item function
- size_t get_group_id (uint dimindx) function / Work item function
- size_t get_local_id (uint dimindx) function / Work item function
- size_t get_local_size (uint dimindx) function / Work item function
- size_t get_num_groups (uint dimindx) function / Work item function
- slice_pitch object / Reading and writing buffers
- Sobel filter / Sobel filter, Sobel filter
- Software Development Kits (SDK) / Installing and setting up an OpenCL compliant computer
- software requirements, OpenCL program
- about / Basic software requirements
- Windows / Windows
- Linux / Linux
- SOS (Start of Scan) / Encoding JPEG
- SPIR / SPIR – Standard Portable Intermediate Representation
- src_origin parameter / Copying and filling images
- Start of Image (SOI) / Encoding JPEG
- storage class specifiers
- about / Storage class specifiers
- Streaming Multiprocessors-X (SMX) / NVIDIA® GeForce® GTX 680 GPU
- strings / Creating and building program objects
- subbuffer objects
- creating / Creating subbuffer objects
- synchronization / Synchronization and memory fence functions
- synchronization, Interoperation / Synchronization
- synchronization models
- single device in-order usage / Single device in-order usage
- single device and out-of-order queue / Single device and out-of-order queue
- multiple devices and different OpenCL contexts / Multiple devices and different OpenCL contexts
- multiple devices and single OpenCL context / Multiple devices and single OpenCL context
T
- time command / Finding the performance of your program?
- tools
- used, for finding performance / Tools for profiling and finding performance bottlenecks
- used, for finding performance-bottleneck / Tools for profiling and finding performance bottlenecks
U
- uint get_work_dim () function / Work item function
- uintptr_t data type / Other data types
- user created events
- about / User-created events
- user_data / OpenCL context, Creating and building program objects
V
- variable attribute / Variable attribute
- vector components / Vector components
- vector data types / Vector data types
- vector types / Basic data types and vector types
- VECTOR_SIZE variable / OpenCL code
- vendor
- strategies / General tips
- vload_half function / Operation on half data type
W
- wglGetCurrentContext() function / Initializing OpenCL context for OpenGL Interoperation
- wglGetCurrentDC() function / Initializing OpenCL context for OpenGL Interoperation
- work-group / NDRange
- work-item / NDRange
- work item function / Work item function
- work_dim / Executing the kernels
- work_dim object / NDRange