Bugs in OpenCL samples?

I think that some of your samples contain a bug in the kernel definition.
The bug is that parameters that are not pointers are qualified with the __global address space. Is this supposed to be OK?
For example:

__kernel void filter( __global const uchar4* src, __global int size)

Here size is qualified with __global, which looks incorrect (?).
I think the correct definition is

__kernel void filter( __global const uchar4* src, int size)

I say so because the AMD OpenCL compiler fails at this point, and removing the qualifier fixes the samples…

Another question (I have also posted this on the AMD OpenCL forums, but I don’t think I’m violating any terms):

I don’t know whether this is supposed to be supported by the OpenCL language, but some samples from the NVIDIA OpenCL SDK don’t work with your SDK, mainly because they use something like:

__kernel void filter( __global const uchar4* src, __global unsigned int* dest, __local uchar4 local[32][32])

The problem lies in __local uchar4 local[32][32], which your SDK fails to compile…

A solution I have found to make their examples work with your SDK is to change two things:

  1. In the kernel definition, replace __local uchar4 local[32][32] with __local uchar4 *local

and

  2. In every reference to the array, replace local[x][y] with local[y+(32)*x]
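The index rewrite above can be checked on the host in plain C. This is only a sketch: TILE, flat_index and layouts_agree are names made up for illustration, and int stands in for uchar4. It assumes the usual row-major layout, where element [x][y] of a 32x32 array sits at offset y + 32*x.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 32  /* tile width taken from the kernel signature (assumed square) */

/* Flat offset of element [x][y] in a row-major TILE x TILE array. */
static size_t flat_index(size_t x, size_t y)
{
    return y + (size_t)TILE * x;
}

/* Returns 1 if every 2D access and its flattened form name the same element. */
static int layouts_agree(void)
{
    static int tile2d[TILE][TILE];
    static int flat[TILE * TILE];
    size_t x, y;

    /* Give every element a unique value through the 2D view. */
    for (x = 0; x < TILE; ++x)
        for (y = 0; y < TILE; ++y)
            tile2d[x][y] = (int)(x * 1000 + y);

    /* Copy the raw bytes into a flat buffer, as a plain pointer would see them. */
    memcpy(flat, tile2d, sizeof flat);

    for (x = 0; x < TILE; ++x)
        for (y = 0; y < TILE; ++y)
            if (flat[flat_index(x, y)] != (int)(x * 1000 + y))
                return 0;
    return 1;
}
```

Calling layouts_agree() from a small main should return 1, which is why replacing the 2D subscript with y+(32)*x leaves the kernel's behaviour unchanged.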

Are you going to fix this, or are the NVIDIA guys using non-standard features?

It seems that NVIDIA is using non-standard features, according to this reply from an AMD engineer (?):

OpenCL allows for the static allocation of local array sizes only within a kernel and not as arguments. This may change in the future and indeed we may choose to implement this but it is not a supported feature, today, either in the specification or our implementation.

As you have pointed out, according to the specification the standard way of passing local arguments is through arguments of the form:

__local T * name

This is defined in the OpenCL 1.0 specification as:

"6.5.2 __local (or local)

The __local or local address space name is used to describe variables that need to be allocated in local memory and are shared by all work-items of a work-group. This qualifier can be used with arguments to functions (including __kernel functions) declared as pointers, or with variables declared inside a __kernel function."

Note the sentence that begins with “This qualifier can be used with arguments to functions (including __kernel functions) declared as pointers,”

Thanks for reporting this! You’re correct, there is some incorrect syntax in our SDK samples. This should be fixed in the next release.

Hello,

As I’ve written on the corresponding AMD thread (http://forums.amd.com/forum/messageview.cfm?catid=390&threadid=117558&forumid=9), I am not convinced (yet ;-)) that this is an erroneous syntax.

If it were, I guess it would mean that accessing a local float4[w][h][d] array would typically require a cast or a hell of a headache, which does not make much sense.

One of the arguments advanced to justify this syntax’s supposed incorrectness is that “OpenCL allows for the static allocation of local array sizes only within a kernel and not as arguments”.

However, there is no static allocation taking place inside the function here: the syntax “void f(int x[64])” does not allocate an int[64] inside the function f. It only declares x as a pointer (C adjusts the parameter to int *x) into an array that must be allocated by the caller; the outside array is not passed by value (not to be confused with a non-argument variable declaration).
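To illustrate the point in plain host-side C (not OpenCL C), here is a minimal sketch; param_address and caller_owns_storage are hypothetical names used only for this example. It shows that a parameter written as int x[64] refers to the caller's buffer rather than to a fresh array inside the function.

```c
#include <assert.h>

/* Declared with array syntax; C adjusts the parameter type to int *. */
static const int *param_address(int x[64])
{
    x[0] = 42;   /* writes into the caller's storage, not into a copy */
    return x;    /* hand back the address the callee actually received */
}

/* Returns 1 if the callee operated on the caller's own buffer. */
static int caller_owns_storage(void)
{
    int buf[64] = {0};
    const int *seen = param_address(buf);
    return seen == buf && buf[0] == 42;
}
```

caller_owns_storage() returns 1: the address seen inside the function is the caller's, and the write is visible after the call. Whether OpenCL C should accept the same syntax for __local kernel arguments is the open question in this thread; the sketch only covers the C behaviour.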

As the size of local arguments must be declared by hand through the clSetKernelArg API call anyway, I can’t see much of an issue with supporting this syntax here… It’s just type lipstick.

Regards

Olivier