CUDA Debugging


I am working on a kernel that finds the minimum of each 3x3 sub-array of an array.

The problem is that I am having a hard time debugging the kernel, which is actually not long. It has begun to feel like working on a 1970s computer, where you have to review your code manually over and over, desperately looking for the possible cause. I have calculated the index numbers by hand about 100 times for every grid, and the indexes for every shared memory transfer. I still end up with a meaningless result file.

So I am pretty sure this is not how the pros do it. Can you help me with a methodology for debugging on GPUs?

Thanks in advance…

  1. Always use proper CUDA error checking.
  2. Run your code with cuda-memcheck. If cuda-memcheck reports kernel execution errors, add -lineinfo to the compile command to narrow the problem down to the specific line of source code that is causing the error.
  3. You can use printf in-kernel.
  4. Learn to use a CUDA debugger.
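
Points 1 to 3 above can be sketched roughly as follows. This is only a minimal illustration, not the asker's kernel: the macro name CUDA_CHECK, the kernel min3x3, and the host helper run_min3x3 are hypothetical names, and the kernel assumes non-overlapping 3x3 tiles read straight from global memory (no shared memory), with one thread per tile. Note also that recent CUDA toolkits ship compute-sanitizer as the successor to cuda-memcheck.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: checks the result of a CUDA runtime call and
// aborts with the file and line number if the call failed.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel (assumption: non-overlapping 3x3 tiles, one thread
// per tile, width and height are multiples of 3).
__global__ void min3x3(const int* in, int* out, int width, int height)
{
    int tx = blockIdx.x * blockDim.x + threadIdx.x;  // tile column
    int ty = blockIdx.y * blockDim.y + threadIdx.y;  // tile row
    int tilesX = width / 3;
    int tilesY = height / 3;
    if (tx >= tilesX || ty >= tilesY) return;        // bounds guard

    int m = in[(ty * 3) * width + tx * 3];
    for (int dy = 0; dy < 3; ++dy) {
        for (int dx = 0; dx < 3; ++dx) {
            int v = in[(ty * 3 + dy) * width + (tx * 3 + dx)];
            if (v < m) m = v;
        }
    }
    out[ty * tilesX + tx] = m;

    // In-kernel printf, restricted to a single thread to keep output small.
    if (tx == 0 && ty == 0) {
        printf("block(%d,%d) thread(%d,%d) tile(%d,%d) min=%d\n",
               blockIdx.x, blockIdx.y, threadIdx.x, threadIdx.y, tx, ty, m);
    }
}

// Hypothetical host helper: copies data to the device, launches the kernel,
// checks for launch and execution errors, and copies the result back.
void run_min3x3(const int* h_in, int* h_out, int width, int height)
{
    int tilesX = width / 3, tilesY = height / 3;
    size_t inBytes  = sizeof(int) * width * height;
    size_t outBytes = sizeof(int) * tilesX * tilesY;

    int *d_in = nullptr, *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, inBytes));
    CUDA_CHECK(cudaMalloc(&d_out, outBytes));
    CUDA_CHECK(cudaMemcpy(d_in, h_in, inBytes, cudaMemcpyHostToDevice));

    dim3 block(16, 16);
    dim3 grid((tilesX + block.x - 1) / block.x,
              (tilesY + block.y - 1) / block.y);
    min3x3<<<grid, block>>>(d_in, d_out, width, height);
    CUDA_CHECK(cudaGetLastError());        // catches launch errors
    CUDA_CHECK(cudaDeviceSynchronize());   // catches execution errors

    CUDA_CHECK(cudaMemcpy(h_out, d_out, outBytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
}

int main()
{
    const int W = 6, H = 6;
    int h_in[W * H], h_out[(W / 3) * (H / 3)];
    for (int i = 0; i < W * H; ++i) h_in[i] = i;  // simple test pattern

    run_min3x3(h_in, h_out, W, H);
    for (int i = 0; i < (W / 3) * (H / 3); ++i)
        printf("tile %d: min = %d\n", i, h_out[i]);
    return 0;
}
```

Assuming the file is saved as min3x3.cu (a made-up name), it could be built with `nvcc -lineinfo -o min3x3 min3x3.cu` and run under the checker with `cuda-memcheck ./min3x3`. A small input with a known pattern, as in main above, makes it easy to compare the printed values against indexes worked out by hand.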

cuda-memcheck also has sub-tools that can be useful, such as racecheck, initcheck, and synccheck. Read the documentation.


Thanks for the reply.

I am trying to learn how to use the CUDA debugger and the error-checking functions.