I’m trying to debug a program for a research project that uses CUDA Dynamic Parallelism. The program ran without errors in both debug mode (with memcheck off) and non-debug mode. However, when I set cuda memcheck on in cuda-gdb, the program failed with the following error:
Error: Failed to get warp state (dev=0, sm=0, wp=1), error=CUDBG_ERROR_UNKNOWN_FUNCTION(0x3)
After spending quite a while trying to figure this out, I decided to try the integrated memcheck in cuda-gdb on some of the provided sample programs.
I observed exactly the same error on 6_Advanced/cdpLUDecomposition.
The error was not observed on 6_Advanced/cdpBezierTessellation, 6_Advanced/cdpAdvancedQuicksort or 6_Advanced/cdpQuadtree, so it’s hard to imagine this has anything to do with dynamic parallelism in general.
I’m on a CentOS machine with a GeForce GTX Titan X device.
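For anyone who wants to try reproducing this, here is a rough sketch of the steps on the failing sample. It assumes the commands are run from the root of the samples tree, that the sample Makefile accepts dbg=1 to build with device debug info, and that the resulting binary lands in the sample directory; adjust the paths if your layout differs.

```shell
# Assumption: current directory is the root of the CUDA samples tree.
cd 6_Advanced/cdpLUDecomposition

# Assumption: dbg=1 makes the sample Makefile build with host and device debug info.
make dbg=1

# Memcheck has to be enabled before the program starts, so set it before "run".
cuda-gdb -ex "set cuda memcheck on" -ex "run" --args ./cdpLUDecomposition
```

Running the same sequence in one of the other cdp sample directories should show whether the warp state error is specific to this sample.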