How do I set up the "function cache configuration"?

cudaFuncSetAttribute(my_kernel, cudaFuncAttributePreferredSharedMemoryCarveout, 61);

I use this because I am on an A100 and I want 100 KB of shared memory and 64 KB of L1, so I set the carveout to 61% (100/164).
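For reference, the 61% figure can be reproduced from the device's reported per-SM shared memory limit instead of hard-coding 164 KB. This is only a sketch under assumptions: `nearest_percent` is a hypothetical helper (not part of CUDA), device 0 is assumed to be the A100, and rounding to the nearest integer is my choice. The carveout is documented as a hint, so the driver may still pick a different split.

```cuda
#include <cstdio>
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper (not a CUDA API): desired_bytes as a percentage of
// max_bytes, rounded to the nearest integer.
static int nearest_percent(size_t desired_bytes, size_t max_bytes) {
    return static_cast<int>((desired_bytes * 100 + max_bytes / 2) / max_bytes);
}

int main() {
    const int device = 0;  // assumption: the A100 is device 0

    // Maximum shared memory per SM, in bytes, as reported by the runtime.
    int max_smem_per_sm = 0;
    cudaError_t err = cudaDeviceGetAttribute(
        &max_smem_per_sm, cudaDevAttrMaxSharedMemoryPerMultiprocessor, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaDeviceGetAttribute: %s\n", cudaGetErrorString(err));
        return 1;
    }

    const size_t desired_bytes = 100 * 1024;  // the 100 KB wanted in the post
    const int percent =
        nearest_percent(desired_bytes, static_cast<size_t>(max_smem_per_sm));

    std::printf("max shared memory per SM: %d bytes\n", max_smem_per_sm);
    std::printf("carveout percentage for %zu bytes: %d\n", desired_bytes, percent);
    return 0;
}
```

The printed percentage is the value that would be passed as the third argument of `cudaFuncSetAttribute` with `cudaFuncAttributePreferredSharedMemoryCarveout`.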

But I still see this in my ncu (version 2023.03): CachePreferNone


Well, maybe the compiler will do it for me automatically, I guess… I used NCU and saw in the occupancy calculator that shared memory is 100 KB. But that does not explicitly show that my kernel is using this clever 100 KB / 64 KB split, and I don't feel that comfortable…
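One way to at least confirm that the preference was recorded for the kernel is to read it back with `cudaFuncGetAttributes`. A minimal sketch, assuming an empty placeholder `my_kernel` in place of the real one. It shows only the stored hint, not the split the hardware actually used at launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel standing in for the real my_kernel (assumption).
__global__ void my_kernel() {}

int main() {
    // Request the same 61% carveout as in the post.
    cudaError_t err = cudaFuncSetAttribute(
        my_kernel, cudaFuncAttributePreferredSharedMemoryCarveout, 61);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaFuncSetAttribute: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Read the kernel's attributes back from the runtime.
    cudaFuncAttributes attr;
    err = cudaFuncGetAttributes(&attr, my_kernel);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaFuncGetAttributes: %s\n", cudaGetErrorString(err));
        return 1;
    }

    std::printf("preferredShmemCarveout:    %d\n", attr.preferredShmemCarveout);
    std::printf("sharedSizeBytes (static):  %zu\n", attr.sharedSizeBytes);
    std::printf("maxDynamicSharedSizeBytes: %d\n", attr.maxDynamicSharedSizeBytes);
    return 0;
}
```

Note that `cudaFuncCachePreferNone` is the default of a separate, older preference that is set with `cudaFuncSetCacheConfig`. The ncu field may be reporting that setting rather than the carveout attribute, though I have not confirmed this.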