Hello, I have a question about GPU architecture.
I want to ask whether two different processes can interfere with each other on the bandwidth between the L2 cache and the L1 caches. I'm not sure whether the L1 caches share bandwidth with each other, since they are in separate SMs.
I'm also curious whether the bandwidth between the L2 cache and the L1 caches is the same as the bandwidth between the L2 cache and device memory.
Thank you in advance!
The L1 caches of different SMs are independent of each other. They can only affect one another when several of them have to wait for L2.
L2-to-L1 bandwidth is about 2x-4x (depending on the GPU model) higher than L2-to-global-device-memory bandwidth. (With "to" meaning both read and write, not indicating a direction.) There are no neatly published official numbers, but you can look at third-party benchmarks or the "Dissecting … Architecture" papers. Or look in Nsight Compute at how much of the caches' bandwidth was occupied.
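If you want to see the L2 vs. device-memory gap yourself, a rough microbenchmark sketch like the one below can help (all sizes, launch configurations, and the assumption that 4 MiB fits in L2 are my own guesses, not official figures): it times repeated coalesced reads over a small buffer that should stay L2-resident and over a large buffer that cannot, so the two measured bandwidths should roughly reflect L2 versus DRAM read speed.

```cuda
// Sketch only: compare achieved read bandwidth for an L2-resident buffer
// versus a DRAM-sized buffer. Sizes and launch config are assumptions.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void sum_reads(const float* __restrict__ src, float* sink,
                          size_t n, int iters) {
    size_t idx    = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    float acc = 0.0f;
    for (int it = 0; it < iters; ++it)
        for (size_t i = idx; i < n; i += stride)
            acc += src[i];                 // coalesced, read-only traffic
    if (acc == -1.0f) sink[idx] = acc;     // never true: keeps the loads alive
}

static double measure_gbs(size_t bytes, int iters) {
    size_t n = bytes / sizeof(float);
    float *src, *sink;
    cudaMalloc(&src, bytes);
    cudaMalloc(&sink, 256 * 1024 * sizeof(float));   // one slot per thread
    cudaMemset(src, 0, bytes);

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    sum_reads<<<256, 1024>>>(src, sink, n, 1);       // warm-up: pull into L2
    cudaEventRecord(t0);
    sum_reads<<<256, 1024>>>(src, sink, n, iters);
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);
    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    cudaFree(src); cudaFree(sink);
    cudaEventDestroy(t0); cudaEventDestroy(t1);
    return (double)bytes * iters / (ms * 1e6);       // GB/s
}

int main() {
    // 4 MiB should fit in L2 on most recent GPUs; 1 GiB certainly does not.
    printf("L2-resident : %.0f GB/s\n", measure_gbs(4u << 20, 1000));
    printf("DRAM-bound  : %.0f GB/s\n", measure_gbs(1u << 30, 10));
    return 0;
}
```

Note this measures what the kernel achieves, not the peak of either path; cross-checking the same run in Nsight Compute's memory workload analysis is more reliable.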
It may be that, within each SM, L1 shares bandwidth with shared memory, as they are based on the same silicon.
Thank you for answering.
I didn't know the L1 caches can only access the L2 cache one at a time. Or is it that there is a certain circumstance under which the L1 caches can only access L2 one at a time, and outside that circumstance they can all access the L2 cache together? I think it would be hard to fully utilize the bandwidth if the L1 caches accessed the L2 cache one at a time all the time.