Hi, according to the video https://www.youtube.com/watch?v=kKANP0kL_hk , kernel launch arguments can be viewed in Nsight Systems.
However, I don’t see them in my Nsight Systems GUI. I can only see kernel names.
I’m using version 2024.2.1.106-242134037904v0 on OSX.
My profile data was collected with nsys profile ./myprogram
How can I turn this on? Or is it even possible now?
Note: if I export the data to JSON, I can find some kernel-launch-related data:
{"Type":79,"CudaEvent":{"startNs":"560857681","endNs":"560859857","correlationId":123,"deviceId":0,"contextId":"1","streamId":"7","eventClass":3,"globalPid":"292907844108288","kernel":{"demangledName":"592","shortName":"593","mangledName":"587","eventCategory":"592","gridX":4,"gridY":1,"gridZ":1,"blockX":256,"blockY":1,"blockZ":1,"staticSharedMemory":0,"dynamicSharedMemory":0,"localMemoryPerThread":0,"localMemoryTotal":127401984,"gridId":"1","registersPerThread":16,"sharedMemoryExecuted":32768,"cacheConfig":1,"launched":1,"sharedMemoryConfig":0,"sharedMemoryLimitConfig":0}}}
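For anyone who wants to read these fields outside the GUI, here is a minimal Python sketch that pulls the launch configuration out of one exported record. It assumes the export holds one JSON object per line, shaped like the record above. The helper name launch_config and the trimmed SAMPLE string are only illustrative. Values such as "shortName":"593" appear to be indices into a string table, which this sketch does not resolve.

```python
import json

# Trimmed copy of the exported record shown above (some fields omitted).
SAMPLE = (
    '{"Type":79,"CudaEvent":{"startNs":"560857681","endNs":"560859857",'
    '"correlationId":123,"deviceId":0,"streamId":"7",'
    '"kernel":{"shortName":"593","gridX":4,"gridY":1,"gridZ":1,'
    '"blockX":256,"blockY":1,"blockZ":1,"staticSharedMemory":0,'
    '"dynamicSharedMemory":0,"registersPerThread":16}}}'
)


def launch_config(event_line):
    """Return the launch fields of a kernel event, or None for other events.

    Hypothetical helper: it only assumes the field names visible in the
    record above.
    """
    record = json.loads(event_line)
    cuda = record.get("CudaEvent", {})
    kernel = cuda.get("kernel")
    if kernel is None:
        # Not a kernel execution record (no "kernel" object present).
        return None
    return {
        "grid": (kernel["gridX"], kernel["gridY"], kernel["gridZ"]),
        "block": (kernel["blockX"], kernel["blockY"], kernel["blockZ"]),
        "static_smem": kernel["staticSharedMemory"],
        "dynamic_smem": kernel["dynamicSharedMemory"],
        "registers_per_thread": kernel["registersPerThread"],
        # Timestamps are exported as strings, so convert before subtracting.
        "duration_ns": int(cuda["endNs"]) - int(cuda["startNs"]),
    }


cfg = launch_config(SAMPLE)
print(cfg["grid"], cfg["block"], cfg["duration_ns"])
```

To scan a whole export, the same function could be applied to each line of the file, skipping lines for which it returns None.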
The launch parameters are shown in the hover tooltip in the GUI.
CPU API side example:
GPU side example:
Greg
March 22, 2024, 9:58pm
The launch parameters such as GridDim, BlockDim, Stream, DynamicSharedMemory, … are captured. The __global__ function parameters are not captured.
Really? I still cannot see launch parameters in your screenshot:
Did I miss anything? Here is a zoom-in of my GUI:
Greg
April 7, 2024, 6:19pm
The HtoD memcpy does not have GridDim, BlockDim, … because it is a command to the asynchronous copy engine, not a kernel launch.
The call to act_and_mul_kernel does not show GridDim, BlockDim, … because you are hovering over the API call, not the GPU workload. The term “Kernel launcher” indicates you are on the API call.
If you mouse over the GPU workload in the rows
[<processId>] <process name>
    CUDA HW (<pcie> - <gpu name>) <<-- here
        Kernels <<-- here
you will see a tooltip with the information.