I have two versions (one CUDA, one OpenCL) of one kernel. I develop the kernel in CUDA first and then port it to OpenCL, which has proven to be a more efficient way to develop. Interestingly, I found that the CUDA built-in “__syncthreads()” compiles in an OpenCL kernel without error.
Because I need to run my OpenCL kernel on different platforms, this causes a kernel build error on the other platforms. It is understandable that your OpenCL compiler is probably derived from your CUDA compiler, but the OpenCL compiler should not be so CUDA-like, as this hinders portability.
I found this bug on driver versions 304.88 and 310.14.
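For anyone hitting the same problem, here is a minimal sketch of a host-side check that could be run on the kernel source before it is handed to the OpenCL compiler, so that leftover CUDA spellings are caught even when one vendor's compiler accepts them. The names `cuda_token`, `CUDA_TOKENS`, and `find_cuda_token` are hypothetical and not part of any SDK. The table is only a small, illustrative subset of the CUDA-to-OpenCL mapping, and the search is a plain substring match that does not skip comments or string literals.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: one CUDA-only spelling and its usual OpenCL C counterpart. */
struct cuda_token {
    const char *cuda;
    const char *opencl;
};

/* Illustrative subset only. For __syncthreads, add CLK_GLOBAL_MEM_FENCE to the
   barrier flags if work-items also exchange data through global memory. */
static const struct cuda_token CUDA_TOKENS[] = {
    { "__syncthreads", "barrier(CLK_LOCAL_MEM_FENCE)" },
    { "__shared__",    "__local" },
    { "__global__",    "__kernel" },
    { "threadIdx.x",   "get_local_id(0)" },
    { "blockIdx.x",    "get_group_id(0)" },
    { "blockDim.x",    "get_local_size(0)" },
};

/* Returns the first entry (in table order) whose CUDA spelling occurs
   anywhere in src, or NULL if none of them occurs. */
const struct cuda_token *find_cuda_token(const char *src)
{
    size_t n = sizeof(CUDA_TOKENS) / sizeof(CUDA_TOKENS[0]);
    assert(src != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (strstr(src, CUDA_TOKENS[i].cuda) != NULL)
            return &CUDA_TOKENS[i];
    }
    return NULL;
}
```

Calling `find_cuda_token` on a kernel string that still contains `__syncthreads()` returns the entry suggesting `barrier(CLK_LOCAL_MEM_FENCE)`, while a kernel that uses only OpenCL C spellings yields NULL.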
Best!
Thanks Biaowang,
The Nsight team just forwarded this post to me. I’ll move it to CUDA Programming and Performance so that the right team looks at it, and I am sure they’ll respond pretty quickly.
BTW, you should consider signing up as a CUDA Registered Developer. You can then submit bugs formally, in addition to getting access to all the latest releases, even pre-release content.
Dear nadeemm:
Thanks for reminding me that I can report bugs directly. Actually, I am already a CUDA Registered Developer, but I didn’t realize I could do this until you reminded me.
Best