I think that some of your samples contain a bug in the kernel definition.
The bug is that parameters that are not pointers are declared with the __global address space qualifier. Is this supposed to be OK?
For example:
__kernel void filter( __global const uchar4* src, __global int size)
Here size is qualified with __global, which looks incorrect (?).
I think the correct definition is:
__kernel void filter( __global const uchar4* src, int size)
I say so because the AMD OpenCL compiler fails at this point, and removing the qualifier fixes the samples.
Another question (I have also posted this on the AMD OpenCL forums, but I don’t think that violates any terms):
I don’t know whether this is supposed to be supported by the OpenCL language, but
some samples that work with the NVIDIA OpenCL SDK don’t work with your SDK, mainly because they use something like:
__kernel void filter( __global const uchar4* src, __global unsigned int* dest,
__local uchar4 local)
The problem lies in __local uchar4 local, which your SDK fails to compile.
A solution I have found to fix their examples for your SDK is to change two things:
- Replace __local uchar4 local with __local *local in the kernel definition
- In references to the array, replace local[y] with local[y+(32)*x]
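The index change above is ordinary row-major flattening. Here is a minimal sketch in plain host C (not OpenCL C), assuming the original sample indexed a two-dimensional array with 32 elements per row, with x selecting the row and y the column. The row count and the names TILE_WIDTH, TILE_ROWS, flat_index and tiles_agree are made up for illustration and do not come from either SDK.

```c
#include <stddef.h>
#include <string.h>

/* Assumed row width: the 32 from local[y+(32)*x] in the post. */
#define TILE_WIDTH 32
/* Hypothetical number of rows, chosen only for this demo. */
#define TILE_ROWS 4

/* Row-major offset of element (x, y): x selects the row, y the column. */
size_t flat_index(size_t x, size_t y)
{
    return y + (size_t)TILE_WIDTH * x;
}

/* Fills a 2D tile, copies its bytes into a flat buffer, and checks that
   flat[flat_index(x, y)] reads the same element as tile2d[x][y].
   Returns 1 if every element matches, 0 otherwise. */
int tiles_agree(void)
{
    unsigned char tile2d[TILE_ROWS][TILE_WIDTH];
    unsigned char flat[TILE_ROWS * TILE_WIDTH];
    size_t x, y;

    for (x = 0; x < TILE_ROWS; ++x)
        for (y = 0; y < TILE_WIDTH; ++y)
            tile2d[x][y] = (unsigned char)(x * TILE_WIDTH + y);

    /* C stores a 2D array contiguously, one row after another. */
    memcpy(flat, tile2d, sizeof flat);

    for (x = 0; x < TILE_ROWS; ++x)
        for (y = 0; y < TILE_WIDTH; ++y)
            if (flat[flat_index(x, y)] != tile2d[x][y])
                return 0;
    return 1;
}
```

The same arithmetic applies inside a kernel once the argument is a plain pointer: every two-index access becomes one access at the flattened offset, provided the row width really is 32.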
Are you going to fix this, or are the NVIDIA guys using non-standard features?