This question is about the cuBLASDx library.

CUDALibrarySamples/MathDx/cuBLASDx/introduction_example.cu at master · NVIDIA/CUDALibrarySamples · GitHub. I copied this file into my environment: CentOS 7, CUDA 12.0, GCC 10.1, C++17. I ran into a problem: when I
call “gemm_kernel_shared”, the program reports "out of memory", but when I call “gemm_kernel_registers”, the program runs successfully. All of the matrices are 8×8. I don't know why this happens. Thanks.