My understanding is that the hardware stores textures and CUDA arrays in a special internal format (some sort of space-filling curve), which differs from the linear memory layout used elsewhere in CUDA. When we bind linear memory to a 2D texture, is a copy created in the texture/array internal format? Or does the hardware operate directly on the linear memory with reduced performance (probably sub-optimal caching)? Interpolation still seems to work.
I’m trying to save memory by avoiding two copies of the same data, one as an array and the other linear, but if a copy gets made anyway, there isn’t much point.
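For reference, here is a minimal sketch of the kind of binding I mean. It assumes the texture object API with a pitch-linear allocation (the older `cudaBindTexture2D` texture-reference path would be the analogous case). The kernel name `sampleKernel`, the 64x64 size, and the test data are made up for illustration, and error checking is omitted for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: reads every texel through the texture unit.
__global__ void sampleKernel(cudaTextureObject_t tex, float* out,
                             int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height) {
        // With unnormalized coordinates, +0.5f addresses the texel center.
        out[y * width + x] = tex2D<float>(tex, x + 0.5f, y + 0.5f);
    }
}

int main()
{
    const int width = 64, height = 64;  // illustrative size
    std::vector<float> host(width * height);
    for (int i = 0; i < width * height; ++i) host[i] = static_cast<float>(i);

    // Pitch-linear device allocation: the only copy of the data on the device.
    float* dLinear = nullptr;
    size_t pitch = 0;
    cudaMallocPitch((void**)&dLinear, &pitch, width * sizeof(float), height);
    cudaMemcpy2D(dLinear, pitch, host.data(), width * sizeof(float),
                 width * sizeof(float), height, cudaMemcpyHostToDevice);

    // Describe the linear allocation as a 2D texture resource (no cudaArray).
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypePitch2D;
    resDesc.res.pitch2D.devPtr = dLinear;
    resDesc.res.pitch2D.desc = cudaCreateChannelDesc<float>();
    resDesc.res.pitch2D.width = width;
    resDesc.res.pitch2D.height = height;
    resDesc.res.pitch2D.pitchInBytes = pitch;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModeLinear;   // interpolation enabled
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr);

    float* dOut = nullptr;
    cudaMalloc((void**)&dOut, width * height * sizeof(float));

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    sampleKernel<<<grid, block>>>(tex, dOut, width, height);

    std::vector<float> result(width * height);
    cudaMemcpy(result.data(), dOut, result.size() * sizeof(float),
               cudaMemcpyDeviceToHost);
    printf("texel (1,0) read through the texture: %f\n", result[1]);

    cudaDestroyTextureObject(tex);
    cudaFree(dOut);
    cudaFree(dLinear);
    return 0;
}
```

The question is whether creating the texture over `dLinear` this way allocates a second, internally reordered copy, or whether fetches go straight to the pitch-linear buffer.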