-Xptxas -dlcm=cg disables caching in the L1
-Xptxas -dlcm=ca enables caching in the L1
Caching at all levels (which is what the ca hint means) is the default behavior, at least for cc 8.0. It also means that for a PTX instruction like ld, which can take a caching modifier, omitting the cache hint is the same as writing the instruction as ld...ca
It might be instructive to start with an arch that doesn’t have the “new” mnemonics. If I compile your code for sm_52 (default) and use cuobjdump -sass, I get:
/*0028*/ LDG.E R2, [R2] ; /* 0xeed4200000070202 */ // default
/*0028*/ LDG.E R2, [R2] ; /* 0xeed4200000070202 */ // -dlcm=ca
/*0028*/ LDG.E.CG R2, [R2] ; /* 0xeed4600000070202 */ // -dlcm=cg
So this lines up with expectation. The default and the -dlcm=ca case are the same: L1 caching is enabled by default, and it is disabled in the cg case via an alternate opcode that shows up as .CG in the disassembly.
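If you want to repeat that comparison yourself, something like the following should do. This is only a sketch: the kernel is a hypothetical stand-in (the names repro.cu and copy_one are mine, not the code from the question), and the exact instruction offsets and encodings you get will depend on your toolkit version.

```cuda
// repro.cu: hypothetical stand-in kernel with a single plain global load.
//
// Build a cubin three ways and disassemble each one; only the
// -Xptxas -dlcm=... switch changes between the builds:
//   nvcc -arch=sm_52 -cubin -o repro.cubin repro.cu
//   nvcc -arch=sm_52 -cubin -Xptxas -dlcm=ca -o repro.cubin repro.cu
//   nvcc -arch=sm_52 -cubin -Xptxas -dlcm=cg -o repro.cubin repro.cu
//   cuobjdump -sass repro.cubin
//
// Swap -arch=sm_52 for -arch=sm_80 to look at the newer mnemonics.
__global__ void copy_one(const int *in, int *out)
{
    // No cache hint appears in the source; whatever shows up on the
    // LDG instruction in the SASS comes from ptxas and its options.
    *out = *in;
}
```

Look for the LDG line in each dump and compare the mnemonic suffix and the encoding.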
Based on that I would conclude in the sm_80 case:
/*0040*/ LDG.E R2, [R2.64] ; /* 0x0000000402027981 */
/* 0x000ea2000c1e1900 */ // default
/*0040*/ LDG.E.STRONG.SM R2, [R2.64] ; /* 0x0000000402027981 */
/* 0x000ea2000c1eb900 */ // -dlcm=ca
/*0040*/ LDG.E.STRONG.GPU R2, [R2.64] ; /* 0x0000000402027981 */
/* 0x000ea2000c1ef900 */ // -dlcm=cg
that from an L1 caching perspective, LDG.E and LDG.E.STRONG.SM are equivalent. Both indicate L1 caching enabled for global loads.
That’s because the -Xptxas -dlcm=... switch applies to code generation after the creation of PTX. It is an override that affects the behavior of the ptxas tool, which converts PTX to SASS. The PTX itself is generated before that switch is applied, so the switch does not change it. You can certainly write your own PTX that uses the .ca or .cg hints, and you should expect to see that reflected in the generated SASS code. If you want to see the PTX generated by the CUDA C++ compiler (i.e. nvcc) modified, then use the offered options for that.
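As a rough illustration of the "write your own PTX" route, here is a sketch using inline PTX with explicit .ca and .cg hints, plus the __ldcg() load intrinsic from the CUDA programming guide for comparison. The function and kernel names (load_ca, load_cg, hinted_loads) are made up for this example, and I'm assuming the pointers refer to global memory. I haven't checked what SASS every toolkit version produces for it, so treat it as a starting point for your own cuobjdump experiments.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Inline PTX load with an explicit .ca hint (cache at all levels).
// Assumes p points to global memory.
__device__ __forceinline__ int load_ca(const int *p)
{
    int v;
    asm volatile("ld.global.ca.s32 %0, [%1];" : "=r"(v) : "l"(p));
    return v;
}

// Inline PTX load with an explicit .cg hint (cache in L2, bypass L1).
// Assumes p points to global memory.
__device__ __forceinline__ int load_cg(const int *p)
{
    int v;
    asm volatile("ld.global.cg.s32 %0, [%1];" : "=r"(v) : "l"(p));
    return v;
}

__global__ void hinted_loads(const int *in, int *out)
{
    out[0] = load_ca(in);      // hand-written PTX, .ca hint
    out[1] = load_cg(in + 1);  // hand-written PTX, .cg hint
    out[2] = __ldcg(in + 2);   // CUDA intrinsic requesting a .cg load (cc 5.0+)
}

int main()
{
    int h_in[3] = {10, 20, 30};
    int h_out[3] = {0, 0, 0};
    int *d_in = nullptr, *d_out = nullptr;

    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    hinted_loads<<<1, 1>>>(d_in, d_out);
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    // The cache hints affect where data is cached, not the values loaded.
    printf("%d %d %d\n", h_out[0], h_out[1], h_out[2]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

Compiling this with nvcc -ptx lets you confirm the hints in the PTX, and cuobjdump -sass on the resulting binary lets you see how each load is encoded.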
If your question is or becomes “what do strong and weak mean?” then I would refer you to the relevant section in the PTX guide. It’s rather involved material, and I wouldn’t assume that a statement like
adequately captures the meaning. I’m not suggesting I intend to try to demonstrate comprehension or explain it all, either.
If your question is or becomes “what is the exact difference between these two opcodes, then”:
/*0040*/ LDG.E R2, [R2.64] ; /* 0x0000000402027981 */
/* 0x000ea2000c1e1900 */ // default
/*0040*/ LDG.E.STRONG.SM R2, [R2.64] ; /* 0x0000000402027981 */
/* 0x000ea2000c1eb900 */ // -dlcm=ca
I wouldn’t be able to answer that. SASS is not documented to that level.