Stack traces on GPU

Hello,

I’m new to CUDA and I’m working on a project that needs to capture stack traces whenever we sample CUDA execution. I have two specific questions:

  1. From what I’ve learned, cuda-gdb supports backtraces for CUDA execution, which means the GPU does save stack frame information, right? Currently I have to submit my job to a GPU node for execution, so how can I use cuda-gdb on the submitted job?

  2. Is it possible to get the full stack trace (host + device) at any point in the kernel? CPU-based programs have many options, such as libunwind, but I’m not familiar with GPUs yet and haven’t found anything useful online. Can anyone share some thoughts or ideas on this? I’d really appreciate it. :))