call stack on the cpu side for a kernel


I’ve played with the cuda-gdb and nvprof tools for some time, but one thing I could never retrieve is the call stack on the CPU side for a particular kernel. In cuda-gdb, backtrace gets you the call stack on the GPU side, but no stack information on the CPU side leading up to the launch of that kernel. I know this is due to the asynchronous execution of the CPU and GPU, but is there a way to know where a kernel was called from on the CPU side? The CUDA profilers like nvprof do not provide such information either, even though it is standard for pure CPU programs.

What is the major obstacle to providing such information here? Any ideas on how to get it?
Thanks a lot.

Hi, hzhang86

Please check if you can get what you need when using “nvprof --cpu-profiling on”

You can also use nvvp and enable “Profile execution on CPU”

Hello, Veraj

I’ve tried both, but neither gives me what I want: the CPU-side and GPU-side information is presented separately. I want an inclusive profiling result, meaning that for a given hot spot in the code, I can see its complete calling context, from the CPU call stack down into the GPU’s.

Here, I just want to know more about how the kernel ID value is assigned in cuda-gdb. Is it a monotonically increasing integer for each kernel call? E.g., if we have the following code:

__global__ void my_kernel1() {}

__global__ void my_kernel2() {}

int main() {
    my_kernel1<<<1, 1>>>();
    for (int i = 0; i < 3; i++)
        my_kernel2<<<1, 1>>>();
    return 0;
}

Are the kernel IDs of the 4 kernel calls (one for my_kernel1 and three for my_kernel2) predetermined as 0, 1, 2, 3? Once a kernel call has finished, will its ID be reused for future kernel calls?


I have posted your question to our dev team.

I will reply once I get a response.