Cache operators

iamkaka · June 15, 2016, 10:30pm

As stated in section 8.7.8.1 of the PTX ISA documentation, cache operators can be used on memory load and store instructions: http://docs.nvidia.com/cuda/parallel-thread-execution/#cache-operators

By using .cg, we can bypass the L1 cache.
Now I want to bypass both L1 and L2. Is there a cache operator that does this?
My situation is that I want to issue two global memory requests and make sure that neither request is served from a cache, so that I can measure the characteristics of DRAM.
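
For reference, a minimal sketch of what a .cg load can look like from CUDA C++, using inline PTX. The names load_cg, copy_cg and run_copy_cg are made up for illustration, and the sketch assumes a 64-bit build where the pointer comes from cudaMalloc (so it refers to global memory). It only shows how to issue the load; it does not demonstrate or measure any caching behavior.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Load one float with the .cg operator (cache at the L2 level, not in L1).
// Assumption: p points to global memory, e.g. memory from cudaMalloc.
__device__ __forceinline__ float load_cg(const float* p) {
    float v;
    asm volatile("ld.global.cg.f32 %0, [%1];" : "=f"(v) : "l"(p) : "memory");
    return v;
}

// Copy kernel whose loads all go through load_cg.
__global__ void copy_cg(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = load_cg(in + i);
}

// Hypothetical host helper: copies n floats through copy_cg and
// returns true if the output matches the input exactly.
bool run_copy_cg(int n) {
    std::vector<float> h_in(n), h_out(n, 0.0f);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    copy_cg<<<blocks, threads>>>(d_in, d_out, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    for (int i = 0; i < n && ok; ++i) {
        if (h_out[i] != h_in[i]) ok = false;
    }
    return ok;
}
```

Calling run_copy_cg(1 << 20) from main only checks that the data round-trips correctly. Recent CUDA toolkits also provide the __ldcg() intrinsic, which should express the same load without inline assembly.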

My other question is: what exactly does “.cs” do? The documentation says:

Cache streaming, likely to be accessed once.
The ld.cs load cached streaming operation allocates global lines with evict-first policy in L1 and L2 to limit cache pollution by temporary streaming data that may be accessed once or twice. When ld.cs is applied to a Local window address, it performs the ld.lu operation.
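
As a rough illustration of the description quoted above, here is a sketch of a kernel that reads each input element exactly once with a streaming load. It assumes a GPU and CUDA toolkit recent enough to provide the __ldcs() intrinsic; scale_cs and run_scale_cs are hypothetical names. As far as I understand, the PTX documentation treats cache operators as performance hints, so this shows how to request the evict-first policy, not proof of what the hardware does with it.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Each input element is read exactly once, so it is loaded with the
// streaming (.cs) hint through the __ldcs() intrinsic.
__global__ void scale_cs(const float* in, float* out, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = factor * __ldcs(in + i);
}

// Hypothetical host helper: doubles n floats with scale_cs and
// returns true if every result is exactly 2 * input.
bool run_scale_cs(int n) {
    std::vector<float> h_in(n), h_out(n, 0.0f);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    if (cudaMalloc(&d_in, n * sizeof(float)) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return false;
    }
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    scale_cs<<<blocks, threads>>>(d_in, d_out, 2.0f, n);
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    for (int i = 0; i < n && ok; ++i) {
        if (h_out[i] != 2.0f * h_in[i]) ok = false;
    }
    return ok;
}
```

The helper only verifies the arithmetic. Whether the streaming hint actually reduces cache pollution would have to be checked with a profiler on the target GPU.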

Robert_Crovella · June 15, 2016, 10:46pm

No, it’s not possible to disable or bypass L2.