Why do I get two different results when I run a P-chase-like algorithm on local memory and on global memory?
Specifically, in the local window the cache block size appears to be 4 bytes, while in the global window it appears to be 32 bytes.
Is it because of the coalescing mechanism? Or, since the Pascal L1 cache is split into two different regions, could one region be for local data and the other for global data (in addition to the separate constant cache and the separate shared memory)?
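For context, a P-chase test of this kind might look like the minimal CUDA sketch below. It is an illustration under assumptions, not the exact code behind the numbers above: the names `chase_global` and `chase_local`, the sizes `N` and `ITERS`, and the stride of one element are all hypothetical. It also assumes that the compiler places the dynamically indexed per-thread array in local memory, which is usual but not guaranteed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int N = 256;      // array length in 4-byte elements (hypothetical)
constexpr int ITERS = 128;  // number of timed dependent loads (hypothetical)

// Global-memory chase: arr[j] holds the index of the next element to visit.
__global__ void chase_global(const unsigned *arr, unsigned *lat, unsigned *idx)
{
    __shared__ unsigned s_lat[ITERS];
    __shared__ unsigned s_idx[ITERS];
    unsigned j = 0;
    for (int i = 0; i < ITERS; ++i) {
        long long t0 = clock64();
        j = arr[j];
        s_idx[i] = j;  // consume j so the load finishes before the second clock read
        long long t1 = clock64();
        s_lat[i] = (unsigned)(t1 - t0);
    }
    for (int i = 0; i < ITERS; ++i) { lat[i] = s_lat[i]; idx[i] = s_idx[i]; }
}

// Local-memory chase: same loop over a per-thread array.
// Assumption: dynamic indexing makes the compiler keep `arr` in local memory.
__global__ void chase_local(unsigned stride, unsigned *lat, unsigned *idx)
{
    __shared__ unsigned s_lat[ITERS];
    __shared__ unsigned s_idx[ITERS];
    unsigned arr[N];
    for (unsigned i = 0; i < N; ++i) arr[i] = (i + stride) % N;
    unsigned j = 0;
    for (int i = 0; i < ITERS; ++i) {
        long long t0 = clock64();
        j = arr[j];
        s_idx[i] = j;
        long long t1 = clock64();
        s_lat[i] = (unsigned)(t1 - t0);
    }
    for (int i = 0; i < ITERS; ++i) { lat[i] = s_lat[i]; idx[i] = s_idx[i]; }
}

static void report(const char *name, const unsigned *d_lat)
{
    unsigned h_lat[ITERS];
    cudaMemcpy(h_lat, d_lat, sizeof(h_lat), cudaMemcpyDeviceToHost);
    for (int i = 0; i < ITERS; ++i)
        printf("%s access %d: %u cycles\n", name, i, h_lat[i]);
}

int main()
{
    const unsigned stride = 1;  // in elements, i.e. 4 bytes per step
    unsigned h_arr[N];
    for (unsigned i = 0; i < N; ++i) h_arr[i] = (i + stride) % N;

    unsigned *d_arr, *d_lat, *d_idx;
    cudaMalloc(&d_arr, sizeof(h_arr));
    cudaMalloc(&d_lat, ITERS * sizeof(unsigned));
    cudaMalloc(&d_idx, ITERS * sizeof(unsigned));
    cudaMemcpy(d_arr, h_arr, sizeof(h_arr), cudaMemcpyHostToDevice);

    chase_global<<<1, 1>>>(d_arr, d_lat, d_idx);  // single thread, no coalescing
    report("global", d_lat);

    chase_local<<<1, 1>>>(stride, d_lat, d_idx);
    report("local", d_lat);

    cudaFree(d_arr); cudaFree(d_lat); cudaFree(d_idx);
    return 0;
}
```

In a test like this, the apparent cache block size is read off from how often a latency spike (a miss) shows up in the per-access timings for a given stride.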