OpenCL: "Hello World" adapted from Apple code [Updated Sep 04 2011]



Petr Schreiber
07-02-2010, 20:06
Hi,

after a brief pause, here comes the next journey into the realm of OpenCL.
This time it is an adaptation of Apple's "OpenCL Hello World" code.

The current state of OpenCL drivers is quite terrifying: even this official example didn't run here, and I had to make a few tweaks to get it working. For this reason I am very curious about your experience with the code.

Lesson learned: avoid at any cost a situation where your application relies on behaviour the specification describes as "implementation defined".

Another funny thing about this example is that it has some precision problems: the first 3 digits after the decimal point are OK, but the rest is usually slightly off. Maybe it is caused by the fact that thinBASIC uses EXT precision during calculations.

Important note: The code requires the OpenCL headers (http://www.thinbasic.com/community/showthread.php?10159-OpenCL-Headers-Updated-Sep-04-2011).

Have fun,
Petr

P.S. Should the Apple notice be kept in this modified version? I do not clearly understand what they are saying there.

Charles Pegge
08-02-2010, 00:21
Hi Petr,

I get correct 1009/1024 computed 1009/1024

kryton9
08-02-2010, 06:53
Petr, did you join the Apple world as Mike Hartlef and my Nephew have?

Petr Schreiber
08-02-2010, 08:54
Charles,

thank you, I get the same result 1009/1024.
When you look at the numbers, they are not that far off, but still...

Hehe Kent, no no, here people with an iPhone are still something you hear about in legends rather than actually see :)
I am just desperately looking for working OpenCL code, but so far I have had to amend every piece of code I picked.
Did it run on your NVIDIA card? It did for me under driver 196.21, but I am not sure about older drivers.


Petr

kryton9
08-02-2010, 23:07
I get an error since I don't have opencl.dll. I am doing my coding on my old Gateway notebook, and that, alas, has an ATI Mobility Radeon.
I need to read up on all these new OpenGL thingies... OpenCL, OpenGL ES, to see what is what and whether I need to download any of them.

Charles Pegge
08-02-2010, 23:34
OpenCL is not quite the paragon of elegance we might hope for - far too many parameters to deal with. I suppose they are trying to cover all the options that might occur but it makes it hard for early adopters.

Petr Schreiber
09-02-2010, 00:12
Hi Kent,

GeForce 8xxx/9xxx/2xx+ is required, or Radeon HD 4xxx/5xxx series, both with the latest drivers.

It is a slightly confusing technology at the moment, mostly because the current implementations are a little fragile, but I find it quite exciting to explore, and it is indeed very fast once you understand the basic rules.

It seems that with ATI you can use both the CPU and the GPU as devices, while with NVIDIA only GPUs seem to be supported, as there is no NVIDIA CPU :)

I am curious whether Intel will enter this area... I hope it won't end up like their promised OpenGL 3.x support.


Petr

kryton9
09-02-2010, 08:15
I think all of these guys on the cutting edge see the big paradigm shift coming: multi-core everywhere and on all sorts of devices. The Tegra 2 is a dual core with incredible battery life. Look at the left sidebar on this link; it gives you an idea of where we are, and will be, computing instead of on the desktop.
http://www.nvidia.com/object/tegra_250.html

I can see why OpenCL is here now, just not sure which direction to go right now... just sort of sitting back and watching on a daily basis at the big changes coming this year.

Petr Schreiber
09-02-2010, 10:37
Hi Kent,

yes, a very nice device. But once you touch GPGPU, with tens or even hundreds of little dedicated compute devices ... 2 cores are not enough :lol:

I updated the code.

kryton9
09-02-2010, 22:57
What I am saying, Petr, is that this is the trend. The dual-core Tegra 2 is two ARM processors and a GPU able to run Unreal Engine 3. Next year we might see the Tegra 4, quad core... and so on. People like not being tethered to their desks, and the new kinds of input these processors can now handle (multi-touch, tilt and acceleration sensors) make the devices very easy and intuitive to use.

I am seeing it with my family members who stumbled with the mouse and never really did anything with their computers besides surf the web and check email. They never used their webcams, they couldn't put their photos into attachments etc. But now with these new devices they are able to do all of the above and are actually enjoying and thrilled by it.

Anyways I am glad you took on OpenCL as I think it will be very very useful soon everywhere.

Charles Pegge
10-02-2010, 08:50
Hi Petr,

Your 15 incorrect results are almost certainly caused by the SINGLE precision multiply in the GPU.

A 32-bit float has 1 sign bit and 8 exponent bits, which leaves 23 bits for the significand,
but the result may be further affected by the GPU's rounding behaviour.

Info on floating point formats:

http://en.wikipedia.org/wiki/Single_precision_floating-point_format
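
To make the 1/8/23 bit layout described above concrete, here is a minimal sketch in Python (not the thinBASIC or Oxygen code from this thread). The helper names `float32_fields` and `to_single` are made up for illustration; the sketch uses only the standard `struct` module to round a value to single precision and split it into its three fields.

```python
import struct

def float32_fields(x):
    """Round x to IEEE 754 single precision and return (sign, exponent, fraction)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31                 # 1 sign bit
    exponent = (bits >> 23) & 0xFF    # 8 exponent bits, stored with a bias of 127
    fraction = bits & 0x7FFFFF        # 23 explicitly stored significand bits
    return sign, exponent, fraction

def to_single(x):
    """Round a Python float (a 64-bit double) to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

print(float32_fields(1.0))     # (0, 127, 0)
print(float32_fields(-2.5))    # (1, 128, 2097152)
print(to_single(0.1) == 0.1)   # False: 0.1 rounds differently in 32 and 64 bits
```

The last line shows why a value computed in single precision rarely compares equal to the same value computed in a wider format: 23 stored bits give only about 7 significant decimal digits.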

Charles

PS: I have your program working inside Oxygen with your OpenCL headers and very few alterations :)
Will post soon.

Charles Pegge
10-02-2010, 10:31
Hi Petr,

I am now getting 100% correct results by assigning data()^2 and results() to SINGLE variables then comparing them.
So I exonerate the GPU from any rounding anomalies. :)

Oxygen Code:



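' Validate our results

correct = 0
single r,d   ' SINGLE temporaries, so both sides are compared at 32-bit precision
'
For i = 1 To count
  r=results(i)          ' value returned by the kernel
  d=dat(i) * dat(i)     ' expected square, stored as SINGLE
  If r=d Then
    correct += 1
  Else
    'PrintL Round(results(i), 6), Format$(Round(dat(i) * data(i), 6))
    ' collect mismatching pairs for display, up to about 1000 characters
    if len(s)<1000 then s+=d tab r crlf
  End If
Next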
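
The same idea can be sketched outside Oxygen. The Python below is an illustration only: it emulates a single precision multiply on the host instead of calling OpenCL, and the names `to_single` and `count_matches` are assumptions, not part of any code posted in this thread. It compares the "kernel" results against an expected value that is either rounded to single precision first or left at double precision.

```python
import random
import struct

def to_single(x):
    """Round a Python float (a 64-bit double) to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def count_matches(data, results, round_expected):
    """Count results equal to data[i] * data[i], optionally rounding the expectation to single."""
    correct = 0
    for value, result in zip(data, results):
        expected = value * value          # computed in double precision
        if round_expected:
            expected = to_single(expected)
        if result == expected:
            correct += 1
    return correct

random.seed(1)
count = 1024
data = [to_single(random.random()) for _ in range(count)]
# Stand-in for the kernel: a single precision multiply of each input by itself.
results = [to_single(v * v) for v in data]

print(count_matches(data, results, round_expected=True), "/", count)   # 1024 / 1024
print(count_matches(data, results, round_expected=False), "/", count)  # fewer than 1024
```

This emulation is not meant to reproduce the 1009/1024 figure reported above; it only shows that an exact equality test passes once both sides are held at the same precision and fails for many elements when the expected value keeps extra bits.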

Petr Schreiber
10-02-2010, 11:20
Hi Charles,

very interesting, thanks for the info :occasion:
I tried to modify the thinBASIC code to follow the approach you proposed, but I still get a mismatch.
Could you check the code in the first post to see whether I ported it correctly?

I am looking forward to Oxygen version!


Petr

Charles Pegge
10-02-2010, 13:46
Hi Petr,

I confirm there is still a precision matching problem in thinBASIC, and I get the same result as before.

I've just posted the Oxygen OpenCL_HelloWorld with the latest Oxygen which now supports
BYVAL 0
BYREF ANY
Also fixed a few other problems relating to BYVAL quads.

Support for EXT variables requires further work but I have redefined EXT as DOUBLE in the main source code, without altering your header files in any way.


So you will need both Oxygen and OpenCL zips.

http://community.thinbasic.com/index.php?topic=2517

Petr Schreiber
10-02-2010, 18:17
Thanks Charles!

It worked perfectly. The only problem is that when an error occurs (try changing the kernel name to some nonsense), it GPFs in Oxygen. But maybe that is because the Oxygen version does not stop on an error, which will be easy to fix.


Petr

Charles Pegge
10-02-2010, 18:30
Glad it worked on your system Petr.

Stopping on an error: my favourite use of GOTO :)

kryton9
13-02-2010, 10:54
Quote (Charles Pegge): "Hi Petr, I get correct 1009/1024 computed 1009/1024"

I got the same result tonight, Petr.

Charles, yours ran fine too and I got 1024/1024.

Charles Pegge
21-02-2010, 07:45
You may be interested in following this thread on the FreeBASIC forum:



Any NVIDIA OpenCL CUDA or AMD/ATI Stream coder here ?
D J Peters
http://www.freebasic.net/forum/viewtopic.php?t=15103