Kernel functions actual parameters

I have a Python program that makes use of high-level libraries (PyTorch, Hugging Face).

I’m trying to use the common tools (Nsight, ncu) to obtain the actual parameter values passed to the specific kernel functions that these libraries launch.

How can I get this data? Why is it hard to get these parameter values?

Daniel

I still haven’t gotten any answer or clue…

In interactive mode, ncu shows the CUDA API trace with function arguments.

,ID,API Name,Details,Func Return,Func Parameter,Start,Duration,Queued,Submitted
:,548,cudaLaunchKernel,,cudaSuccess(0),"(0x563d566ca630, {262144,1,1}, {256,1,1}, 0x7ffdf18166d0{0x7ffdf1816650}, 0, 0x0)",,,,
  :,549,cuLibraryLoadData,,CUDA_SUCCESS(0),"(0x7ffdf1816168{0x563d849dc400}, 0x563d5676f058, 0x7ffdf1816170{NONE(2215319712)}, 0x7ffdf1816190{0x563d840b18a0}, 0, 0x7ffdf1816160{CU_LIBRARY_BINARY_IS_PRESERVED(1)}, 0x7ffdf1816180{0x1}, 1)",,,,
  :,550,cuCtxPushCurrent_v2,,CUDA_SUCCESS(0),(0x563d84006360),,,,
  :,551,cuLibraryGetModule,,CUDA_SUCCESS(0),"(0x7ffdf1816270{0x563d849d8a30}, 0x563d849dc400)",,,,
  :,552,cuCtxPopCurrent_v2,,CUDA_SUCCESS(0),(0x7ffdf1816278{0x563d84006360}),,,,
  :,553,cuModuleGetFunction,,CUDA_SUCCESS(0),"(0x563d849d3088{0x563d849d9fd0}, 0x563d849d8a30, ""_Z6kernelPii"")",,,,
  :,554,cuLaunchKernel,,CUDA_SUCCESS(0),"(0x563d849d9fd0, 262144, 1, 1, 256, 1, 1, 0, 0x0, 0x7ffdf18166d0{0x7ffdf1816650}, 0x0)",,,,
  :,555,kernel,kernel,cudaSuccess(0),,,,,

Thank you so much for your answer! Sorry about the newbie question, but take the green value shown here, for example: “(0x563d566ca630, {262144,1,1}, {256,1,1}, 0x7ffdf18166d0{0x7ffdf1816650}, 0, 0x0)”
These are the memory addresses of the parameter values, am I right?

Those are the function arguments. For cudaLaunchKernel, the arguments are:

__host__ cudaError_t cudaLaunchKernel(const void* func, dim3 gridDim, dim3 blockDim, void** args, size_t sharedMem, cudaStream_t stream)
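
To make the mapping onto that signature concrete, here is a minimal Python sketch that splits the Func Parameter string of row 548 into the six cudaLaunchKernel arguments. The helper name parse_launch_params and the regular expression are my own, not part of ncu, and the sketch assumes the string format is exactly what the trace above shows, which may differ between ncu versions. The value in braces after the args address appears to be what that pointer points to, that is, the first entry of the args array.

```python
import re

# Assumed format, copied from row 548 of the trace above:
# (func, {gx,gy,gz}, {bx,by,bz}, args{args[0]}, sharedMem, stream)
LAUNCH_RE = re.compile(
    r"\((?P<func>0x[0-9a-f]+), "
    r"\{(?P<grid>\d+,\d+,\d+)\}, "
    r"\{(?P<block>\d+,\d+,\d+)\}, "
    r"(?P<args>0x[0-9a-f]+)\{(?P<args0>0x[0-9a-f]+)\}, "
    r"(?P<shared>\d+), "
    r"(?P<stream>0x[0-9a-f]+)\)"
)


def parse_launch_params(text):
    """Hypothetical helper: map a cudaLaunchKernel trace string to named fields."""
    m = LAUNCH_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError("string does not match the assumed trace format")
    return {
        "func": int(m["func"], 16),
        "gridDim": tuple(int(v) for v in m["grid"].split(",")),
        "blockDim": tuple(int(v) for v in m["block"].split(",")),
        "args": int(m["args"], 16),        # address of the void** array
        "args0": int(m["args0"], 16),      # address of the first parameter's storage
        "sharedMem": int(m["shared"]),
        "stream": int(m["stream"], 16),
    }


row_548 = "(0x563d566ca630, {262144,1,1}, {256,1,1}, 0x7ffdf18166d0{0x7ffdf1816650}, 0, 0x0)"
launch = parse_launch_params(row_548)
print(launch["gridDim"], launch["blockDim"], hex(launch["args"]))
```

Every address here is a host address in the profiled process, so the values behind them can only be read while that process is still alive.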

You would need to attach a debugger during interactive profiling and manually dereference the args pointers, which point to the kernel parameters, using the address given by ncu. Note that the API trace does not tell you how many kernel arguments there are or what their types are; you need the kernel’s signature for that.
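
The dereferencing step can be illustrated on the host without a GPU. The sketch below uses Python's ctypes to build a void** args array the way a host program would for a kernel with the signature kernel(int*, int), which is what the mangled name _Z6kernelPii in the trace demangles to, and then recovers the values starting from only the address of args. The concrete values (a fake device pointer 0x7f0000001000 and n = 1024) and the helper name read_kernel_args are made up for illustration. In a real session you would read these addresses from the profiled process's memory in a debugger, not from your own process.

```python
import ctypes

# Hypothetical kernel parameters for kernel(int*, int); values are made up.
dev_ptr = ctypes.c_void_p(0x7F0000001000)  # stands in for a device pointer
n = ctypes.c_int(1024)

# Host side: args is an array of pointers, one per kernel parameter,
# each pointing at the storage that holds that parameter's value.
args = (ctypes.c_void_p * 2)(ctypes.addressof(dev_ptr), ctypes.addressof(n))
args_address = ctypes.addressof(args)  # plays the role of the address ncu reports


def read_kernel_args(address, param_types):
    """Dereference a void** args array, given the kernel's parameter types."""
    slots = (ctypes.c_void_p * len(param_types)).from_address(address)
    return [t.from_address(slot).value for t, slot in zip(param_types, slots)]


# The parameter types must come from the kernel signature; args does not carry them.
values = read_kernel_args(args_address, [ctypes.c_void_p, ctypes.c_int])
print(hex(values[0]), values[1])
```

For a pointer parameter this yields only the device address. The data it points to, such as matrix contents, lives in GPU memory and has to be copied back to the host before it can be inspected.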

Is there no way to print, log, or write to a file all the parameter values stored at these addresses? If not, do you have any tips on how to do this automatically rather than manually through debugging?

One of my needs, for example: during a model training session, for every matrix multiplication kernel that is launched while training, I want to log the actual matrix values.

And again, thank you so much for helping me, I appreciate that a lot!!!

I do not know how to achieve this automatically.

Write the data into global memory or even pinned host memory from within your kernel (with the buffer address stored in __constant__ or __device__ memory so that you do not need to pass it as a parameter).
Then store or log that memory to a file on the CPU.