Meaning of "Shared Memory Configuration Size"?

Hi there,

Nsight Compute reports “Launch Statistics” for each profiled kernel, including a “Shared Memory Configuration Size” field (launch__shared_mem_config_size).

For example:

$ ncu python t1.py
    ...
    Section: Launch Statistics
    ---------------------------------------------------------------------- --------------- ------------------------------
    Block Size                                                                                                         64
    Grid Size                                                                                                          64
    Registers Per Thread                                                   register/thread                             48
    Shared Memory Configuration Size                                                  byte                              0
...

What does this field mean? Is it calculated from cudaFuncAttributes::preferredShmemCarveout and cudaDevAttrMaxSharedMemoryPerMultiprocessor?

“Shared Memory Configuration Size” is the shared memory size, in bytes, that the CUDA driver configures for this kernel launch, per block, taking into account all other configuration options and constraints set by the application, the CUDA driver, or the hardware. The driver calculates the value, and the tool reports it directly.
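To make the carveout part of the question concrete, here is a minimal Python sketch of how a carveout percentage is described as mapping to a supported capacity. It is not the driver's algorithm. It only models the rounding rule given in the CUDA C++ Programming Guide for compute capability 7.0 (supported capacities of 0, 8, 16, 32, 64, or 96 KB, with the preference rounded up to the next supported value). The function name `carveout_hint_kb` and the constant `SUPPORTED_KB_SM70` are hypothetical names chosen for illustration. The carveout is only a hint, so the value ncu reports can differ from this estimate.

```python
import bisect

# Shared memory capacities per SM (in KB) that the CUDA C++ Programming Guide
# lists for compute capability 7.0. Other architectures use different lists.
SUPPORTED_KB_SM70 = [0, 8, 16, 32, 64, 96]


def carveout_hint_kb(carveout_percent, supported_kb=SUPPORTED_KB_SM70):
    """Round a carveout percentage up to the next supported capacity (KB).

    Illustrative model only: the driver treats the carveout as a hint and
    also weighs other constraints, so the real configuration may differ.
    """
    if not 0 <= carveout_percent <= 100:
        raise ValueError("carveout_percent must be between 0 and 100")
    requested_kb = max(supported_kb) * carveout_percent / 100
    # Index of the first supported capacity that is >= the requested amount.
    index = bisect.bisect_left(supported_kb, requested_kb)
    return supported_kb[index]


print(carveout_hint_kb(50))  # 50% of 96 KB is 48 KB, rounded up to 64
```

On a real device, the only reliable number is the one the driver chose, which is what launch__shared_mem_config_size reports.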

Thanks ~
