I see, thank you very much for the explanation! I've wondered why CPUs don't use a combined L1-cache/shared-memory approach and let the programmer explicitly place data in the cache. It seems very helpful to have both an automatically hardware-managed cache and a programmer-controlled one, like shared memory on GPUs, so that explicit cache control is at our disposal when we need it. Is there any reason CPUs are not designed that way?