I have a running Mean Shift clustering code which is slower on the GPU than on the CPU (on a Mac with Leopard, LLVM).
I hoped to fix the problem with some cache-aware optimizations, using clGetDeviceInfo with the parameter CL_DEVICE_GLOBAL_MEM_CACHE_SIZE,
but unfortunately the function returns 0 bytes for the GPU's cache size.
On the CPU the optimization makes sense, because there the function correctly reports the CPU's 2 MB of cache.
Don't GPUs have a cache, or is this just another missing (not yet implemented) feature?
As far as I know they do have one, even if it is only a few kilobytes, which would already help!