Switch off L1 cache

Greg · March 24, 2015, 8:00pm

If an instruction's memory accesses are highly divergent and each address range is accessed only once, then performing uncached global loads can save bandwidth. Caching can be controlled on a per-instruction basis using inline PTX. L1 caching of global loads can also be disabled for a whole compilation unit with the compiler option -Xptxas -dlcm=cg.
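As a rough sketch of the per-instruction approach, the helper below wraps a single PTX load with the .cg cache operator (cache in L2 only, bypassing L1). The names load_cg and gather_uncached are hypothetical and chosen only for illustration. The sketch assumes a 64-bit build and that the pointer refers to global memory.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: load one float with the .cg cache operator,
// i.e. cached in L2 but not in L1. Assumes ptr points to global memory
// and that pointers are 64-bit (hence the "l" constraint).
__device__ __forceinline__ float load_cg(const float* ptr)
{
    float value;
    asm volatile("ld.global.cg.f32 %0, [%1];" : "=f"(value) : "l"(ptr));
    return value;
}

// Hypothetical kernel: a scattered gather where each input element is
// read once, which is the access pattern described above.
__global__ void gather_uncached(const float* in, const int* idx, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = load_cg(in + idx[i]);
    }
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> h_in(n), h_out(n);
    std::vector<int> h_idx(n);
    for (int i = 0; i < n; ++i) {
        h_in[i] = static_cast<float>(i);
        // Multiplying by an odd constant modulo a power of two gives a scattered permutation.
        h_idx[i] = static_cast<int>((static_cast<long long>(i) * 7919) % n);
    }

    float *d_in = nullptr, *d_out = nullptr;
    int* d_idx = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMalloc(&d_idx, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_idx, h_idx.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    gather_uncached<<<(n + 255) / 256, 256>>>(d_in, d_idx, d_out, n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Compare against a host-side gather to confirm the loads are correct.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != h_in[h_idx[i]]) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_idx);
    return mismatches != 0;
}
```

To apply the same policy to every global load in a file without touching the source, compile with nvcc -Xptxas -dlcm=cg instead. Whether either approach helps depends on the access pattern and the GPU, so it is worth profiling both variants.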

For more information see the Global Memory section in the CUDA Programming Guide for the compute capability of your GPU.

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-memory-2-x
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-memory-3-0
http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-memory-5-x

If you are developing for a compute capability 3.5 device, you may also want to investigate the LDG instruction (exposed in CUDA C as the __ldg() intrinsic), which performs read-only global loads through the texture cache. The texture cache can perform better for highly divergent memory accesses and when the application is heavily accessing shared, local, or global memory.
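A minimal sketch of the read-only path follows, assuming a compute capability 3.5 or newer target (for example, compiled with nvcc -arch=sm_35 on a toolkit that still supports it). The kernel name gather_ldg and the host-side setup are hypothetical. The input buffer must not be written during the kernel's lifetime for __ldg() to be valid.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: a scattered gather that reads the input through
// the read-only (texture) data cache using the __ldg() intrinsic.
// const + __restrict__ also tells the compiler the data is read-only.
__global__ void gather_ldg(const float* __restrict__ in,
                           const int* __restrict__ idx,
                           float* __restrict__ out,
                           int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = __ldg(&in[idx[i]]);
    }
}

int main()
{
    const int n = 1 << 16;
    std::vector<float> h_in(n), h_out(n);
    std::vector<int> h_idx(n);
    for (int i = 0; i < n; ++i) {
        h_in[i] = static_cast<float>(i);
        h_idx[i] = (n - 1) - i;  // reversed order, just to make the gather non-trivial
    }

    float *d_in = nullptr, *d_out = nullptr;
    int* d_idx = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMalloc(&d_idx, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(d_idx, h_idx.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    gather_ldg<<<(n + 255) / 256, 256>>>(d_in, d_idx, d_out, n);
    cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Compare against a host-side gather.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != h_in[h_idx[i]]) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    cudaFree(d_idx);
    return mismatches != 0;
}
```

With const and __restrict__ on the pointer, the compiler may generate LDG on its own, but calling __ldg() explicitly makes the intent unambiguous.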

All device memory accesses always go through L2.

System memory accesses are currently not cached in L2.

There are no additional controls for L2.
