Hi!
I’m trying to get the Radix Sort sample from the NVIDIA SDK running on my Intel CPU and have run into a problem. I have already searched the web for hours with no luck.
The application crashes with an “unknown exception at 0x775b07b6 in oclRadixSort.exe: 0xC0000029 …”. The exception is thrown at “clFinish()” (if GPU_PROFILING is defined) or at “clEnqueueReadBuffer”.
I found that the following lines in Radixsort.cl (scanwarp…) seem to be the problem. If I delete or comment out these lines, the results are wrong, but the crash no longer occurs.
if (0 <= maxlevel) { sData[idx] += sData[idx - 1]; }
if (1 <= maxlevel) { sData[idx] += sData[idx - 2]; }
if (2 <= maxlevel) { sData[idx] += sData[idx - 4]; }
if (3 <= maxlevel) { sData[idx] += sData[idx - 8]; }
if (4 <= maxlevel) { sData[idx] += sData[idx -16]; }
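For reference, this is how I understand what the five lines above are meant to compute: an inclusive prefix sum over one warp. Below is a minimal sketch in plain C, not the SDK code. It rests on several assumptions of mine: a warp has 32 work-items (the largest offset is 16), idx points into the upper half of a 2 * 32 element array whose lower half is zeroed, and all 32 work-items execute each line in lockstep (the lines have no barrier between them). The function name scan_warp_lockstep is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define WARP_SIZE 32  /* assumption: 32 work-items per warp */

/* Hypothetical helper: simulates one warp executing the scan lines in
 * lockstep. At each level every lane first reads its operand, and only
 * then do all lanes write, which is what lockstep execution implies.
 * Writes the inclusive prefix sums of val[] to out[]. */
static void scan_warp_lockstep(const unsigned val[WARP_SIZE],
                               unsigned out[WARP_SIZE], int maxlevel)
{
    unsigned sData[2 * WARP_SIZE]; /* assumed layout: zero padding + values */
    unsigned operand[WARP_SIZE];
    int lane, level;

    assert(val != NULL && out != NULL);

    for (lane = 0; lane < WARP_SIZE; ++lane) {
        sData[lane] = 0;                      /* padding, read for idx - offset */
        sData[WARP_SIZE + lane] = val[lane];  /* sData[idx] = val */
    }

    /* level 0..4 corresponds to offsets 1, 2, 4, 8, 16 */
    for (level = 0; level <= maxlevel && level <= 4; ++level) {
        int offset = 1 << level;
        for (lane = 0; lane < WARP_SIZE; ++lane)   /* all lanes read ... */
            operand[lane] = sData[WARP_SIZE + lane - offset];
        for (lane = 0; lane < WARP_SIZE; ++lane)   /* ... then all lanes write */
            sData[WARP_SIZE + lane] += operand[lane];
    }

    for (lane = 0; lane < WARP_SIZE; ++lane)
        out[lane] = sData[WARP_SIZE + lane];
}
```

With maxlevel = 4 and an input of all ones, out[i] should be i + 1 under these assumptions. I have not verified whether the CPU runtime executes the work-items in a way that matches this lockstep model.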
Could the crash be caused by the difference between local and global memory?
CPU: CL_DEVICE_LOCAL_MEM_TYPE = global
GPU: CL_DEVICE_LOCAL_MEM_TYPE = local
If that’s the case, how could I rewrite the sample to get it running on a CPU?
Let me know if more details about my setup and code changes are required.
Any help is much appreciated.