Radix Sort sample on CPU crashes

Hi!

I’m trying to get the Radix Sort sample from the NVIDIA SDK running on my Intel CPU and have run into a problem. I have already searched the web for hours with no luck.

The application crashes with an “unknown exception at 0x775b07b6 in oclRadixSort.exe: 0xC0000029 …”. The exception is thrown at “clFinish()” (if GPU_PROFILING is defined) or at “clEnqueueReadBuffer” (if it is not).

I figured out that the following lines in Radixsort.cl (scanwarp…) seem to be the problem. If I delete or comment out these lines, the results are wrong, but the crash goes away.

if (0 <= maxlevel) { sData[idx] += sData[idx - 1]; }
if (1 <= maxlevel) { sData[idx] += sData[idx - 2]; }
if (2 <= maxlevel) { sData[idx] += sData[idx - 4]; }
if (3 <= maxlevel) { sData[idx] += sData[idx - 8]; }
if (4 <= maxlevel) { sData[idx] += sData[idx -16]; }
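
For reference, this is how I understand what these lines compute. It is a minimal sketch in plain C (not the sample's code) that emulates one warp of work-items running in lockstep on the host. The function name scanwarp_emulated, the WARP_SIZE of 32 and the zero-padded layout of sData are my own assumptions about how scanwarp works, so please correct me if I misread it.

```c
#include <assert.h>

#define WARP_SIZE 32  /* assumption: warp width used by the sample */

/* Host-side emulation of the five kernel lines for ONE warp.
   Every emulated work-item finishes a level before any work-item starts
   the next one (lockstep). Writes the inclusive prefix sum to out[]. */
static void scanwarp_emulated(const unsigned val[WARP_SIZE],
                              unsigned out[WARP_SIZE], int maxlevel)
{
    unsigned sData[2 * WARP_SIZE];  /* assumption: lower half is zero padding */
    unsigned next[WARP_SIZE];       /* holds the results of the current level */

    assert(maxlevel <= 4);          /* the kernel lines only cover levels 0..4 */

    for (int lid = 0; lid < WARP_SIZE; ++lid) {
        sData[lid] = 0;                     /* padding read via idx - offset */
        sData[lid + WARP_SIZE] = val[lid];
    }

    for (int level = 0; level <= maxlevel; ++level) {
        int offset = 1 << level;            /* 1, 2, 4, 8, 16 */

        /* all work-items read first ... */
        for (int lid = 0; lid < WARP_SIZE; ++lid) {
            int idx = lid + WARP_SIZE;
            next[lid] = sData[idx] + sData[idx - offset];
        }
        /* ... and only then write, as lockstep execution would */
        for (int lid = 0; lid < WARP_SIZE; ++lid)
            sData[lid + WARP_SIZE] = next[lid];
    }

    for (int lid = 0; lid < WARP_SIZE; ++lid)
        out[lid] = sData[lid + WARP_SIZE];
}
```

With every input set to 1 and maxlevel = 4, this gives out[i] == i + 1. The kernel lines only produce that result if every work-item finishes one level before any work-item starts the next. I have not verified whether that holds on the CPU device.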

Could this be a problem with local vs. global memory? The two devices report different local memory types:

CPU: CL_DEVICE_LOCAL_MEM_TYPE = global

GPU: CL_DEVICE_LOCAL_MEM_TYPE = local

If that’s the case, how could I rewrite the sample to get it running on a CPU?

Let me know if more details about my setup and code changes are required.

Any help is much appreciated.