I see, thank you very much for the explanation! I've wondered why CPUs don't use a combined L1-cache/shared-memory approach and let the programmer explicitly place data in the cache. It seems very helpful to have both an automatically hardware-managed cache and a programmer-controlled one, like shared memory on GPUs, so that explicit cache control is at our disposal when we actually need it. Are there any reasons CPUs are not designed that way?
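For context, here is a minimal sketch of what I mean by "explicit cache control" in CUDA, assuming a simple 1D three-point stencil; the names `TILE`, `stencil_l1`, and `stencil_shared` are just illustrative, not from any particular codebase:

```cuda
#include <cuda_runtime.h>

#define TILE 256  // threads per block; also the tile size staged into shared memory

// Relies on the automatic L1/L2 caches: repeated loads of neighboring
// elements may or may not hit, entirely at the hardware's discretion.
__global__ void stencil_l1(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        out[i] = 0.25f * in[i - 1] + 0.5f * in[i] + 0.25f * in[i + 1];
}

// Explicit "cache control": each block stages its tile (plus a one-element
// halo on each side) into shared memory, so the reuse of neighboring
// elements is guaranteed to be served on-chip.
__global__ void stencil_shared(const float* in, float* out, int n)
{
    __shared__ float tile[TILE + 2];           // +2 for the left/right halo
    int i  = blockIdx.x * blockDim.x + threadIdx.x;
    int li = threadIdx.x + 1;                  // local index, shifted past the left halo

    if (i < n) tile[li] = in[i];
    if (threadIdx.x == 0 && i > 0)                  tile[0] = in[i - 1];
    if (threadIdx.x == blockDim.x - 1 && i < n - 1) tile[TILE + 1] = in[i + 1];
    __syncthreads();

    if (i > 0 && i < n - 1)
        out[i] = 0.25f * tile[li - 1] + 0.5f * tile[li] + 0.25f * tile[li + 1];
}
```

(Both kernels would be launched with `TILE` threads per block, e.g. `stencil_shared<<<(n + TILE - 1) / TILE, TILE>>>(in, out, n);`.) On a CPU there is no equivalent of the second version: you can only hint (e.g. with prefetch instructions) and hope the cache keeps the data resident.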