Problem reading from __constant memory


I’m having a vexing problem reading from __constant memory. I wrote a 3x3 median filter that works on tiled images, with the actual x/y coordinates of the upper left corner of each tile passed via a __constant memory buffer (the idea is to be able to reduce processing effort if only a part of the image is interesting). This code will work for one run, then fail for the next (using identical inputs). After much experimentation I figured out that the problem is the OpenCL kernel reading the wrong coordinates out of the __constant buffer. I made sure the memory buffer is smaller than CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE, I only have a single __constant qualifier in my kernel arguments, and I made sure the data is copied correctly into device memory (I read it back and compared it with the original data I copied into device memory). Being at my wits’ end, I then simply replaced __constant with __global, and my code now works correctly.

So my question(s): is there something else besides CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE that needs to be taken into account when using __constant? Does the constant memory buffer have to be created with CL_MEM_READ_ONLY for this to work reliably? (I can’t test this easily since most of the OpenCL support code on the host was written by somebody else, and I don’t want to muck around in that part of the code if I can help it.) Or is this a possible bug in the NVIDIA OpenCL implementation?
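For reference, the OpenCL specification lists a second device limit next to CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: CL_DEVICE_MAX_CONSTANT_ARGS, the maximum number of __constant arguments a kernel may declare. Below is a minimal sketch of checking both. It assumes the two limits were already queried with clGetDeviceInfo; the helper name constant_args_fit and its parameters are hypothetical, not part of any API. Passing this check does not rule out a driver bug, it only excludes the documented limits.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the OpenCL API.
 * arg_bytes: sizes in bytes of the buffers bound to __constant arguments
 * n_args:    number of __constant arguments in the kernel
 * max_bytes: value queried for CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE
 * max_args:  value queried for CL_DEVICE_MAX_CONSTANT_ARGS
 * Returns 1 if every argument is within both limits, 0 otherwise. */
static int constant_args_fit(const size_t *arg_bytes, unsigned n_args,
                             size_t max_bytes, unsigned max_args)
{
    if (n_args > max_args)
        return 0;                 /* too many __constant arguments */
    for (unsigned i = 0; i < n_args; ++i)
        if (arg_bytes[i] > max_bytes)
            return 0;             /* this buffer is too large */
    return 1;
}
```

In the case described above there is one __constant argument, so n_args would be 1 and arg_bytes would hold the size of the tile coordinate buffer.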

P.S.: all tests were run on an x86-64 Linux system using ‘OpenCL 1.0 CUDA 3.0.1’ and the 195.30 drivers.

Hey, I also have some problems with the __constant buffer.
I have 2 kernels, both with a __constant buffer in the parameter list along with some other buffers.
I only use kernel 1.
Depending on the names and/or the order of the parameters of kernel 2, the values in the constant buffer are read correctly or incorrectly.
If I delete kernel 2, everything works fine.
I think this has to be a serious bug in the driver.
The error happens with all current drivers (GTX 260 on Windows and Linux).


Same problem here and elsewhere:…p;#entry1073729
Using __global seems to work around the issue.
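One way to switch between the two address spaces without editing the kernel is to pass the qualifier as a macro in the clBuildProgram options string. The sketch below is only an illustration: the macro TILE_SPACE, the kernel probe_tiles, and the helper tile_space_options are made-up names, and the probe kernel simply echoes the coordinates it reads so the host can compare them with what was uploaded.

```c
#include <stdio.h>

/* Probe kernel (OpenCL C), kept as a host-side string.
 * TILE_SPACE is a hypothetical macro that must be defined through the
 * build options as either __constant or __global. */
static const char *probe_kernel_src =
    "__kernel void probe_tiles(TILE_SPACE const int2 *tile_origin,\n"
    "                          __global int2 *seen)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    seen[i] = tile_origin[i];  /* echo what the kernel actually read */\n"
    "}\n";

/* Writes the options string for clBuildProgram into buf.
 * use_global != 0 selects the __global workaround, 0 selects __constant.
 * Returns the snprintf result (length that was or would be written). */
static int tile_space_options(int use_global, char *buf, size_t len)
{
    return snprintf(buf, len, "-D TILE_SPACE=%s",
                    use_global ? "__global" : "__constant");
}
```

The resulting string would be passed as the options argument of clBuildProgram. Reading back the seen buffer after a run with each setting shows whether the __constant path returns different coordinates from the __global path for the same input.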
