Hey, Nvidia developers:
I have two versions of one kernel, one in CUDA and one in OpenCL. I develop the kernel in CUDA first and then port it to OpenCL, which has proved to be the more efficient way to develop. Interestingly, I found that the CUDA built-in
"__syncthreads()" compiles inside an OpenCL kernel without error.
Because I need to run my OpenCL kernel on different platforms, the same kernel then fails to build on the other platforms. It is understandable that you probably ported your CUDA compiler to build your OpenCL compiler, but the OpenCL compiler should not be so CUDA-like, as this hinders portability.
I found this bug with driver versions 304.88 and 310.14.
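
For anyone who hits the same thing: below is a minimal sketch of the kind of shim I would use so that the CUDA spelling never leaks into the OpenCL source. LOCAL_SYNC and LOCAL_SYNC_BACKEND are names I made up for illustration and are not part of either API. The sketch relies on __OPENCL_VERSION__ being predefined by OpenCL C compilers and __CUDACC__ by nvcc. Note that barrier(CLK_LOCAL_MEM_FENCE) only fences local memory; if the kernel also depends on global-memory visibility at that point, CLK_GLOBAL_MEM_FENCE would probably need to be OR-ed in.

```c
/* Sketch of a shared header: one barrier name for both toolchains.
 * LOCAL_SYNC and LOCAL_SYNC_BACKEND are hypothetical names. */
#if defined(__OPENCL_VERSION__)
  /* OpenCL C: work-group barrier with a local-memory fence */
  #define LOCAL_SYNC()       barrier(CLK_LOCAL_MEM_FENCE)
  #define LOCAL_SYNC_BACKEND "opencl"
#elif defined(__CUDACC__)
  /* CUDA (nvcc): block-wide barrier */
  #define LOCAL_SYNC()       __syncthreads()
  #define LOCAL_SYNC_BACKEND "cuda"
#else
  /* Plain host C compiler: no-op, so the header still parses */
  #define LOCAL_SYNC()       ((void)0)
  #define LOCAL_SYNC_BACKEND "host-noop"
#endif
```

In the kernel body, LOCAL_SYNC() would go wherever __syncthreads() appears in the CUDA version, so that the OpenCL build only ever sees barrier().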