Hi,
As you probably know, OpenCL is a new standard for parallel computing. It is still in its infancy (the specification is ready, but implementations are appearing slowly). Unlike CUDA or Stream, it is not a vendor-specific technology, and at the moment you can run it on NVIDIA, ATI and S3 hardware.
OpenCL allows you to execute C-like code on the GPU, with the work automatically spread across its processing units. That means you can get a dramatic speed boost for the time-critical parts of your code, without having to go the assembler route.
It seems ideal for vector operations, image processing and other heavy tasks.
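To give a rough idea of what "C-like code, automatically parallelized" means, here is a minimal sketch. It is not the attached example. The kernel string shows how a vector addition kernel is typically written in OpenCL C; the names vector_add, vector_add_work_item and run_vector_add are my own picks for illustration. The two C functions under it only simulate on the CPU what the OpenCL runtime does on the GPU (one work-item per element), so the snippet compiles without any OpenCL headers or drivers.

```c
#include <assert.h>
#include <stddef.h>

/* OpenCL C kernel source, in the form a host program would hand to
   clCreateProgramWithSource. Each work-item adds exactly one pair of
   elements; get_global_id(0) tells it which one. */
static const char *vector_add_src =
    "__kernel void vector_add(__global const float *a,\n"
    "                         __global const float *b,\n"
    "                         __global float *c)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    c[i] = a[i] + b[i];\n"
    "}\n";

/* CPU stand-in for the kernel body (hypothetical helper):
   gid plays the role of get_global_id(0). */
static void vector_add_work_item(size_t gid, const float *a,
                                 const float *b, float *c)
{
    c[gid] = a[gid] + b[gid];
}

/* CPU stand-in for launching the kernel over n work-items (hypothetical
   helper). On a GPU the iterations would run in parallel; here they
   simply run one after another. */
static void run_vector_add(const float *a, const float *b, float *c, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t gid = 0; gid < n; ++gid)
        vector_add_work_item(gid, a, b, c);
}
```

The point to notice is that the kernel contains no loop. You describe the work for a single element, and the runtime takes care of running it for every index. In a real program the loop in run_vector_add is replaced by a call to clEnqueueNDRangeKernel with a global work size of n.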
If you own a GeForce 8xxx or newer, or a Radeon HD 4xxx or newer, with OpenCL-enabled drivers, you can try the attached example code, which sums vectors. Older cards do not support OpenCL, and probably never will.
I provide an adaptation of the code from the NVIDIA OpenCL Jump Start Guide. The guide is not a bad introduction to OpenCL, except that it contains quite a few typos and mistakes. The code I provide should be a working adaptation of "OpenCL Host Code" and performs vector addition.
The code could be optimized further, but as-is it should give you an idea of how OpenCL coding works.
If you have the hardware, I would appreciate any feedback (works/doesn't work, problems with headers, ...).
Important note: The code requires the OpenCL headers.
Petr