I tried “set cuda break_on_launch system” with the vectorAdd and simpleCUBLAS samples from the SDK, both with and without “-g -G”, but it does not break on CUDA library calls such as cudaMalloc or on cuBLAS kernels. Am I doing this correctly, or do I need additional configuration to debug system kernels? My CUDA version is 8.0. Thanks!
Is it possible to debug kernels inside a dynamically linked library, or should I link the library statically?
For instance, can I debug cuBLAS or cuDNN library kernels with “set cuda break_on_launch system”?