More Shared Memory by disabling L1 Cache?

I am desperate for more Shared Memory.

On the 2.x architecture, each multiprocessor has 64 KB of on-chip memory that can be split as 48 KB shared memory / 16 KB L1 cache or vice versa.

Other posts in this forum have claimed/asked/noted that the nvcc compiler flag “-Xptxas -dlcm=cg” will “disable the L1 cache line”. The PTX-ISA-3.1 Reference guide says:

“Cache at global level (cache in L2 and below, not L1).
Use ld.cg to cache loads only globally, bypassing the L1 cache, and cache only in the L2 cache. As a result of this request, any existing cache lines that match the requested address in L1 will be evicted.”

Does this mean that all 64 KB will be available for Shared Memory? I would be very happy if this were the case!

Sadly, no. That flag only affects the instructions the compiler generates. The load instructions it emits bypass L1, but that does not change how much of the on-chip memory is allocated to the L1 cache, so the most you can get for shared memory is still the 48 KB configuration.
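
For what it's worth, the shared/L1 split itself is chosen through the runtime API rather than through a compiler flag. Here is a minimal sketch, assuming a made-up kernel called stagingKernel with a 40 KB static shared array (both the name and the size are just for illustration). It asks for the larger shared memory configuration with cudaFuncSetCacheConfig and prints the per-block limit the device reports. On newer architectures the cache config is only a preference, so treat this as a sketch rather than a guarantee.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: stages data through a 40 KB shared array.
// 40 KB does not fit in the 16 KB shared configuration on 2.x,
// so it relies on the 48 KB shared / 16 KB L1 split.
__global__ void stagingKernel(const float* in, float* out, int n)
{
    __shared__ float tile[10240];  // 10240 floats * 4 bytes = 40 KB

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    if (i < n) {
        out[i] = tile[threadIdx.x];
    }
}

int main()
{
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    // Ask the runtime to prefer the larger shared memory configuration
    // for this kernel.
    cudaError_t err = cudaFuncSetCacheConfig(stagingKernel, cudaFuncCachePreferShared);
    if (err != cudaSuccess) {
        printf("cudaFuncSetCacheConfig failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Report the per-block shared memory limit of device 0.
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    printf("Shared memory per block: %zu bytes\n", prop.sharedMemPerBlock);

    float *in = NULL, *out = NULL;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    stagingKernel<<<blocks, threads>>>(in, out, n);
    err = cudaDeviceSynchronize();
    printf("Kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Adding -Xptxas -dlcm=cg on top of this changes how global loads are cached, but the limit printed above stays the same.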


Thank you, seibert. Indeed it is too bad that we can’t do that.