I get the following error when using the memory debugger while testing thrust::sort_by_key or CUB radix sort. DeviceRadixSortDownsweepKernel is used by both CUB and Thrust.
I have tested with CUDA 8.0.44 and 8.0.61 on a Pascal Titan X and can reproduce the error below.
However, when using CUDA 7.5 and a Maxwell Titan X there is no error.
CUDA context created : 13be844c3e0
CUDA module loaded: 13be85c1b00 radixSortThrust.cu
Internal debugger error occurred while attempting to launch _ZN6thrust6system4cuda6detail4cub_30DeviceRadixSortDownsweepKernelINS3_23DeviceRadixSortDispatchILb0EjjiE21PtxAltDownsweepPolicyELb0EjjiEEvPT1_S9_PT2_SB_PT3_SC_iibbNS3_13GridEvenShareISC_EE in CUcontext 0x13be844c3e0, CUmodule 0x13be85c1b00: code patching failed for unknown reason.
All breakpoints for function _ZN6thrust6system4cuda6detail4cub_30DeviceRadixSortDownsweepKernelINS3_23DeviceRadixSortDispatchILb0EjjiE21PtxAltDownsweepPolicyELb0EjjiEEvPT1_S9_PT2_SB_PT3_SC_iibbNS3_13GridEvenShareISC_EE have been removed.
See Output View for additional messages of this type.
I’m not sure what the error means, and the error message itself seems unsure too. I tried increasing the code patching setting up to 32x just as a test, and nothing changed.
Currently my larger code is a bit unstable, and due to the error above it is more difficult to test it with the memory debugging feature (my main system has a Pascal GPU only). For now my best option seems to be to find or write a different GPU sort.
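As a stopgap while the memory debugger is unusable, one thing I can do is compare the GPU sort output against a host-side reference. Below is a minimal sketch of that idea using only the C++ standard library. The helper name `reference_sort_by_key` is made up for illustration, and the 32-bit unsigned keys and values are an assumption based on the `jj` (unsigned int, unsigned int) template arguments in the mangled kernel name above. It also assumes both vectors have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Thrust or CUB): sorts `keys` ascending
// and applies the same permutation to `values`. Equal keys keep their
// original relative order because std::stable_sort is used.
// Assumes keys.size() == values.size().
void reference_sort_by_key(std::vector<std::uint32_t>& keys,
                           std::vector<std::uint32_t>& values) {
    // Build the index permutation 0, 1, 2, ... and sort it by key.
    std::vector<std::size_t> order(keys.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&keys](std::size_t a, std::size_t b) {
                         return keys[a] < keys[b];
                     });

    // Gather keys and values through the sorted permutation.
    std::vector<std::uint32_t> sorted_keys(keys.size());
    std::vector<std::uint32_t> sorted_values(values.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sorted_keys[i] = keys[order[i]];
        sorted_values[i] = values[order[i]];
    }
    keys.swap(sorted_keys);
    values.swap(sorted_values);
}
```

The idea would be to copy the device results back to the host and compare them with this reference. Since the reference is stable, the values for duplicate keys should only be expected to match exactly if the GPU sort is stable as well (for example thrust::stable_sort_by_key); otherwise only the keys can be compared element by element. This checks the sort results only, it does not replace the out-of-bounds checks the memory debugger would give me.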
Any help or suggestions for the above would be appreciated. Also, if anyone knows a good key-value GPU sort, that could be a solution too.