Hi there,
I have an issue with slow bandwidth in OpenCL on Snow Leopard (OS X). At first I thought it was NVIDIA's fault, something to do with the driver. So I decided to compile the test against just the OpenCL framework provided by Apple, taking the source files from NVIDIA's SDK and building the application with: g++ *.cpp -framework OpenCL -o oclBandwidthTest
I get this rather interesting, and annoying, result, which seems to say there is a problem with how OS X treats my GPU (GTX 275, 1792 MB), but I do not believe that:
./oclBandwidthTest64 Starting…
WARNING: NVIDIA OpenCL platform not found - defaulting to first platform!
Running on…
GeForce GTX 275
Quick Mode
Host to Device Bandwidth, 1 Device(s), Paged memory, direct access
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 2243.4
Device to Host Bandwidth, 1 Device(s), Paged memory, direct access
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 2343.1
Device to Device Bandwidth, 1 Device(s)
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 6134.7
PASSED
Press to Quit…
Since I want to use this machine for GPGPU work, the first two figures need to approach 4 GB/s and the last needs to exceed 100 GB/s.
I am attaching two files, a 64-bit build (oclBandwidthTest64) and a 32-bit build (oclBandwidthTest32), so that you can run them and post your results here, along with any suggestions on how to fix this.
I do not believe there is an issue with OpenCL or with OS X itself, since MatrixMultiplication takes 220 ms here compared with 120 ms on Linux. That suggests the card is working as it should: if the memory bandwidth really were 6 GB/s, it would be far slower. I will try to tweak oclBandwidthTest to see the actual execution times. There might just be a problem with how you folks at NVIDIA implemented the test, so that it shows this peculiar result on OS X.
Alex.