I have a C program that uses the CUDA driver API to compile and launch a hand-generated PTX kernel. Is it possible to use cuda-gdb to trace the execution of the kernel at the PTX instruction level? If not, could anyone please suggest a tool that I can use to do so?
Thanks a lot for your help!