__constant memory issues

It looks like __constant memory is rather buggy in NVidia’s implementation and has issues that come up pretty randomly. That is, it works, and then after an unrelated change (e.g. adding another kernel to the program that isn’t actually used in any way) it stops working and all reads from __constant memory (apparently) return zero. Make some random changes or just wait, and the problem disappears. That has hit me multiple times so far.

Are these issues known to NVidia? Searching on the net shows that several people seem to have related problems at the very least.

PS
I’m using stable 256.52 Linux drivers, not OpenCL 1.1 beta drivers.

http://www.khronos.org/message_boards/view…f=28&t=2727 seems to have the same issue.

Note how the problem is quite non-deterministic…

Edit: and this: http://forums.nvidia.com/index.php?showtopic=171429 and http://forums.nvidia.com/index.php?showtopic=171952

That sounds familiar to me as well. IIRC I encountered this once.

Any word from NVidia about this? Where can I properly report bugs?

I also have the zero-value constant memory problem, but I don’t know where I should submit my app so they can debug it.

If you are a registered developer, you should be able to submit bugs after logging in to partners.nvidia.com. It is the same place where the 1.1 beta is downloaded.

If someone has a really simple example (always good when submitting bugs), and does not have an account, then something might be worked out.

I’ve submitted an application for the developer program, let’s see how that works out.

daemonized

I am currently fighting the same issue and I plan to submit a bug report along with a sample project as well. The only workaround I have found so far is to use the __global qualifier instead of __constant (this has a major impact on performance, as you would expect). For the small structures I was having issues with, I copied them into local space, but this only works for relatively small structures and hurts performance as well.

If you would, please post any response you get from NVIDIA in this thread. I am VERY interested in the reply. As a note, I have seen this exact issue with older GTX8800 cards and also with newer FX3800 and GTX480M (mobile) cards. This is a relatively big issue, and I'm surprised more people have not reported it. I have verified I am NOT exceeding the constant memory size or the maximum number of constant references per kernel.

Let’s see, so far I haven’t received any reply to my application. I heard in the worst case it takes a few weeks…?

I’m seeing the issue on both G80 and GT200 architecture GPUs, so yes, it doesn’t look like it is GPU-specific. I completely agree that this is a big issue. Disabling constant memory (#define __constant __global) roughly triples the execution time of my code. I’m not using much constant memory either (less than 10 kilobytes), so constant cache size definitely isn’t an issue.
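
The `#define __constant __global` fallback mentioned above can be applied without touching the .cl file, by prepending the define to the source string before it is handed to the OpenCL compiler. Below is a minimal sketch in plain C. The helper `make_kernel_source` and the constant `CONSTANT_WORKAROUND_PREFIX` are hypothetical names made up for this example and are not part of any OpenCL API; the returned string would then be passed to `clCreateProgramWithSource` as usual.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Line that remaps every __constant qualifier to __global
   (the fallback quoted in the post above). */
static const char *const CONSTANT_WORKAROUND_PREFIX =
    "#define __constant __global\n";

/* Hypothetical helper (not an OpenCL API call): returns a newly allocated
   copy of kernel_src, prefixed with the define when disable_constant is
   nonzero. Returns NULL if allocation fails. The caller must free() it. */
char *make_kernel_source(const char *kernel_src, int disable_constant)
{
    assert(kernel_src != NULL);

    const char *prefix = disable_constant ? CONSTANT_WORKAROUND_PREFIX : "";
    size_t len = strlen(prefix) + strlen(kernel_src) + 1; /* +1 for '\0' */

    char *out = malloc(len);
    if (out == NULL)
        return NULL;

    snprintf(out, len, "%s%s", prefix, kernel_src);
    return out;
}
```

Keeping the fallback behind a flag makes it easy to switch back once a fixed driver is available. Note that the extra line shifts the line numbers in compiler diagnostics by one.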

Hi,
I am using an NVidia Quadro FX 1800 and I am facing the same issue. After some experimenting and testing I found that if the __constant keyword is used just once anywhere in my .cl file, it works. However, if the __constant keyword is used anywhere else in the file, even somewhere unrelated to the first instance, all the constant variables read as zero. This is highly annoying in my case, where I have some 10-15 kernels in the same file, all taking one __constant float array as a parameter. The only workaround I see is to put these kernels in separate files! Highly inconvenient, I guess!
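
For a bug report, the trigger described above could be reduced to something like the following. This is only a sketch: the kernel names `copy_a` and `copy_b`, the source string `TWO_KERNEL_SOURCE` and the helper `looks_like_zeroed_constant` are made up for illustration, and nothing here is confirmed to reproduce the bug on any particular driver. The host-side check flags the reported symptom, an output buffer that is all zeros although the input was not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced test case: two unrelated kernels in one source
   string, each taking a __constant float array as a parameter. */
const char *const TWO_KERNEL_SOURCE =
    "__kernel void copy_a(__constant float *src, __global float *dst)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    dst[i] = src[i];\n"
    "}\n"
    "__kernel void copy_b(__constant float *src, __global float *dst)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    dst[i] = src[i] * 2.0f;\n"
    "}\n";

/* Returns 1 if `actual` shows the reported symptom: every element is zero
   although at least one element of `expected` is nonzero. Returns 0 if any
   element of `actual` is nonzero or if zeros were the expected result. */
int looks_like_zeroed_constant(const float *expected, const float *actual,
                               size_t n)
{
    assert(expected != NULL && actual != NULL);

    int expected_has_nonzero = 0;
    for (size_t i = 0; i < n; ++i) {
        if (actual[i] != 0.0f)
            return 0; /* the kernel read something, so not the symptom */
        if (expected[i] != 0.0f)
            expected_has_nonzero = 1;
    }
    return expected_has_nonzero;
}
```

Running each kernel once and applying the check to the buffer read back would show whether a build is affected. Building the two kernels from separate source strings instead would correspond to the separate-files workaround.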

nitish17

Thank you for the information. I understand that if you split the kernels into separate files, your application works. I am wondering what happens if you have more than a single __constant reference in a given kernel (which is in its own file). Even though you use only one per kernel, I am wondering what would be the result if you included a second copy of the constant in another 'clSetKernelArg' argument to the kernel (with a different name in the kernel itself, of course). Even though that second copy would never be referenced in the kernel code, would you still get a correct result? I am going to try this as well, but I do not always get the same results (there is some randomness to the error), so I am hoping you would not mind helping characterize the error more?

Thank you…

I am experiencing the same problem and submitted a bug report long ago (NVIDIA Incident Report 596613). NVIDIA claims that the problem is fixed, but I still have to put kernels using __constant memory into separate files.

My application for the developer program was accepted. I’m trying to put together a reduced test case at the moment and will submit another bug report with the test case.

stefanw, did NVidia say which driver fixed the problem? Maybe it’s fixed in the (not yet released) OpenCL 1.1 branch.
Anyway, they should really release another OpenCL 1.1 beta. :)