What's the meaning of `Shared Memory executed` in nsys

Hi,

Here is what the nsys GUI shows when I click on one kernel:

Begins: 8.28491s
Ends: 8.35452s (+69.615 ms)
grid: <<<64, 64, 1>>>
block: <<<16, 16, 1>>>
Launch Type: Regular
Static Shared Memory: 8,192 bytes
Dynamic Shared Memory: 0 bytes
Registers Per Thread: 127
Local Memory Per Thread: 0 bytes
Local Memory Total: 205,258,752 bytes
Shared Memory executed: 65,536 bytes
Shared Memory Bank Size: 4 B
Theoretical occupancy: 25 %
Launched from thread: 1363913
Latency: ←642.541 ms
Correlation ID: 263
Stream: Default stream 7

What does “Shared Memory executed” mean, and how does it differ from “Static Shared Memory”?

@skottapalli

“Static Shared Memory” is the amount of shared memory that is statically allocated for a kernel, i.e. the size is fixed at compile time (for example, `__shared__` arrays declared with a constant size).

“Shared Memory executed” is the shared memory size set by the driver.

Nsight Systems gets the information about kernels from CUPTI. Please see section 6.46, CUpti_ActivityKernel8, in the CUPTI 13.0 documentation for more details on the information CUPTI reports for each kernel.

“Shared Memory executed” is the shared memory carveout size that the driver actually used (“executed”) for the kernel. It may differ from the requested size, and it is not the shared memory size of a thread block.
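
As a rough cross-check of the numbers in the tooltip above, the Python sketch below treats “Shared Memory executed” as a per-SM carveout and “Static Shared Memory” as a per-block allocation. The hardware limits it uses (65,536 registers and 2,048 resident threads per SM, register counts rounded up to a multiple of 8 per thread) are assumptions, not values stated in this thread, and they vary by GPU architecture. It also ignores other limits, such as the maximum number of resident blocks per SM and any shared memory the system reserves per block.

```python
# Values copied from the kernel tooltip in the question.
static_smem_per_block = 8_192    # "Static Shared Memory", bytes per thread block
dynamic_smem_per_block = 0       # "Dynamic Shared Memory", bytes per thread block
smem_carveout = 65_536           # "Shared Memory executed", bytes (assumed per SM)
threads_per_block = 16 * 16      # block: <<<16, 16, 1>>>
registers_per_thread = 127       # "Registers Per Thread"

# ASSUMED hardware limits (not given in the thread; check your GPU).
REGISTERS_PER_SM = 65_536
MAX_THREADS_PER_SM = 2_048
REGISTER_GRANULARITY = 8         # per-thread register count rounded up to this


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple`."""
    return -(-value // multiple) * multiple


# How many blocks fit into the carveout, judged by shared memory alone.
smem_per_block = static_smem_per_block + dynamic_smem_per_block
blocks_by_smem = smem_carveout // smem_per_block

# How many blocks fit, judged by registers alone.
regs_per_block = round_up(registers_per_thread, REGISTER_GRANULARITY) * threads_per_block
blocks_by_regs = REGISTERS_PER_SM // regs_per_block

# How many blocks fit, judged by the resident-thread limit alone.
blocks_by_threads = MAX_THREADS_PER_SM // threads_per_block

# The tightest limit decides how many blocks are resident on one SM.
resident_blocks = min(blocks_by_smem, blocks_by_regs, blocks_by_threads)
occupancy = resident_blocks * threads_per_block / MAX_THREADS_PER_SM

print(f"blocks per SM by shared memory: {blocks_by_smem}")   # 8
print(f"blocks per SM by registers:     {blocks_by_regs}")   # 2
print(f"blocks per SM by threads:       {blocks_by_threads}")  # 8
print(f"theoretical occupancy:          {occupancy:.0%}")    # 25%
```

Under these assumptions registers, not shared memory, limit the kernel to 2 resident blocks per SM, which matches the 25 % theoretical occupancy in the tooltip. A 65,536-byte carveout is far more than the 2 × 8,192 bytes those two blocks need, which is consistent with the carveout being a different quantity from the per-block static allocation.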

Thanks! @skottapalli @Greg

Yes, I found that the ‘Static Shared Memory’ metric is consistent with the size I allocated in my code, while ‘Shared Memory executed’ is much larger.