I am curious if there is a way for a programmer to force something to be in L1 cache? Are there specific qualifiers in the CUDA programming model that put a variable in L1 cache? I believed what goes in the caches is determined by the hardware, but since we can explicitly put information in shared memory, is there also a way to control what goes (or stays) in L1 cache?
You currently cannot force L1 residency (“persistence”) on CUDA GPUs. There is some limited ability to do this with the L2 on newer GPUs (compute capability 8.0 and later), through the access policy window mechanism.
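For reference, here is a rough sketch of what the L2 side looks like, assuming CUDA 11 or later and a GPU that reports a nonzero `persistingL2CacheMaxSize`. The helper name `request_l2_persistence` and the buffer size are made up for illustration; the runtime calls and fields (`cudaDeviceSetLimit`, `cudaStreamSetAttribute`, `accessPolicyWindow`) are the documented ones. Note that this only marks an address range as preferred for L2 retention. It is not a guarantee, and it has no L1 equivalent.

```cuda
#include <algorithm>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: asks the runtime to treat [ptr, ptr + bytes) as
// "persisting" in L2 for work launched into `stream`.
// Returns true if the request was accepted, false if unsupported or failed.
bool request_l2_persistence(cudaStream_t stream, void* ptr, size_t bytes) {
    int dev = 0;
    cudaDeviceProp prop = {};
    if (cudaGetDevice(&dev) != cudaSuccess) return false;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) return false;
    if (prop.persistingL2CacheMaxSize <= 0) return false;  // no L2 persistence support

    // Set aside part of the L2 for persisting accesses.
    size_t carve = std::min(bytes, (size_t)prop.persistingL2CacheMaxSize);
    if (cudaDeviceSetLimit(cudaLimitPersistingL2CacheSize, carve) != cudaSuccess)
        return false;

    // Describe the address window and how hits and misses should be treated.
    size_t window = std::min(bytes, (size_t)prop.accessPolicyMaxWindowSize);
    cudaStreamAttrValue attr = {};
    attr.accessPolicyWindow.base_ptr  = ptr;
    attr.accessPolicyWindow.num_bytes = window;
    // Scale the hit ratio so the persisting portion fits in the carve-out.
    attr.accessPolicyWindow.hitRatio  = std::min(1.0f, (float)carve / (float)window);
    attr.accessPolicyWindow.hitProp   = cudaAccessPropertyPersisting;
    attr.accessPolicyWindow.missProp  = cudaAccessPropertyStreaming;
    return cudaStreamSetAttribute(stream, cudaStreamAttributeAccessPolicyWindow,
                                  &attr) == cudaSuccess;
}

int main() {
    const size_t bytes = 1 << 20;  // 1 MiB, arbitrary size for the sketch
    float* d_buf = nullptr;
    cudaStream_t stream;
    if (cudaMalloc(&d_buf, bytes) != cudaSuccess) return 1;
    if (cudaStreamCreate(&stream) != cudaSuccess) return 1;

    bool ok = request_l2_persistence(stream, d_buf, bytes);
    printf("L2 persistence request %s\n", ok ? "accepted" : "not available");

    // Kernels launched into `stream` after this point would see the policy.
    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    return 0;
}
```

Kernels launched into that stream afterwards have their accesses to the window treated as persisting candidates; on older GPUs the helper simply returns false.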
You can, however, influence whether a given load is cached in the L1 or not, using the caching modifiers (cache operators) on loads. But you don’t get control over the L1 eviction policy.
Maybe I am missing something, but none of the cache operators apply to L1 cache specifically. Is that correct? cg would apply to L2 and below, and ca appears to apply to all levels. Does using ca mean that loads are cached in L1 by default?
Yes, the PTX ISA offers more detail.
They are hints. A hint is not guaranteed to do anything.
In practice, __ldcg inhibits caching of global loads in the L1. In practice, __ldca encourages caching of global loads in the L1.
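To make that concrete, here is a minimal sketch using the two intrinsics. The kernel names, the helper `count_mismatches`, and the sizes are invented for the example; `__ldcg` and `__ldca` are the real intrinsics and correspond to `ld.global.cg` and `ld.global.ca` in PTX. Both kernels produce identical results, since the operators only hint at cache behavior and never change what value is loaded.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Load with the .cg operator: cache in L2 and below, hinting to bypass L1.
__global__ void copy_cg(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __ldcg(in + i);
}

// Load with the .ca operator: cache at all levels, including L1.
__global__ void copy_ca(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __ldca(in + i);
}

// Hypothetical helper: runs both kernels and counts elements that differ
// from the input. Returns -1 if device allocation fails.
int count_mismatches(int n) {
    std::vector<float> h_in(n), h_cg(n, -1.0f), h_ca(n, -1.0f);
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    size_t bytes = (size_t)n * sizeof(float);
    float *d_in = nullptr, *d_cg = nullptr, *d_ca = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess ||
        cudaMalloc(&d_cg, bytes) != cudaSuccess ||
        cudaMalloc(&d_ca, bytes) != cudaSuccess) {
        cudaFree(d_in); cudaFree(d_cg); cudaFree(d_ca);
        return -1;
    }
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    copy_cg<<<blocks, threads>>>(d_in, d_cg, n);
    copy_ca<<<blocks, threads>>>(d_in, d_ca, n);

    // These blocking copies also wait for the kernels to finish.
    cudaMemcpy(h_cg.data(), d_cg, bytes, cudaMemcpyDeviceToHost);
    cudaMemcpy(h_ca.data(), d_ca, bytes, cudaMemcpyDeviceToHost);

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h_cg[i] != h_in[i]) ++bad;
        if (h_ca[i] != h_in[i]) ++bad;
    }
    cudaFree(d_in); cudaFree(d_cg); cudaFree(d_ca);
    return bad;
}

int main() {
    printf("mismatches: %d\n", count_mismatches(1 << 16));
    return 0;
}
```

Any difference between the two variants shows up only in profiler metrics such as L1 hit rates, not in the output. Whether it shows up at all depends on the architecture, because these remain hints.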