Texture access performance

My application makes closely spaced accesses. For it, texture fetches from linear memory are faster than fetches from 2D arrays, but only by 2% or so. I gain a few more percent of total performance because the kernel output already sits in linear memory and can be read through the texture on the next iteration, which removes the need for device-to-device memory copies between iterations.
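
To illustrate how textured linear memory avoids the per-iteration copy, here is a minimal sketch, not my actual application. It assumes the texture object API (`cudaCreateTextureObject`, compute capability 3.0 or later) rather than whatever binding mechanism a given code base uses, and the names `smooth`, `make_linear_texture` and `run_pingpong` as well as the 3-point averaging stencil are hypothetical stand-ins for the real kernel.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <cstring>
#include <vector>

// Abort on any CUDA error (crude, but enough for a sketch).
static void check(cudaError_t e) { assert(e == cudaSuccess); (void)e; }

// Hypothetical stencil: average of three neighbouring elements, all read
// through the texture path (closely spaced accesses), written to linear memory.
__global__ void smooth(cudaTextureObject_t src, float* dst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    int lo = max(i - 1, 0);
    int hi = min(i + 1, n - 1);
    dst[i] = (tex1Dfetch<float>(src, lo) +
              tex1Dfetch<float>(src, i) +
              tex1Dfetch<float>(src, hi)) / 3.0f;
}

// Wrap an existing linear device buffer in a texture object.
static cudaTextureObject_t make_linear_texture(float* dev, size_t n) {
    cudaResourceDesc res;
    std::memset(&res, 0, sizeof(res));
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = dev;
    res.res.linear.desc = cudaCreateChannelDesc<float>();
    res.res.linear.sizeInBytes = n * sizeof(float);

    cudaTextureDesc td;
    std::memset(&td, 0, sizeof(td));
    td.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    check(cudaCreateTextureObject(&tex, &res, &td, nullptr));
    return tex;
}

// Run `iters` iterations, ping-ponging between two linear buffers.
// Each buffer has its own texture object, so swapping roles is just an
// index flip: no device-to-device copy between iterations.
std::vector<float> run_pingpong(const std::vector<float>& input, int iters) {
    int n = static_cast<int>(input.size());
    size_t bytes = n * sizeof(float);
    float* buf[2];
    cudaTextureObject_t tex[2];
    for (int k = 0; k < 2; ++k) {
        check(cudaMalloc(&buf[k], bytes));
        tex[k] = make_linear_texture(buf[k], n);
    }
    check(cudaMemcpy(buf[0], input.data(), bytes, cudaMemcpyHostToDevice));

    int src = 0;
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    for (int it = 0; it < iters; ++it) {
        // Read from tex[src], write to the other buffer, then swap roles.
        smooth<<<blocks, threads>>>(tex[src], buf[1 - src], n);
        src = 1 - src;
    }
    check(cudaDeviceSynchronize());

    std::vector<float> out(n);
    check(cudaMemcpy(out.data(), buf[src], bytes, cudaMemcpyDeviceToHost));
    for (int k = 0; k < 2; ++k) {
        check(cudaDestroyTextureObject(tex[k]));
        check(cudaFree(buf[k]));
    }
    return out;
}
```

With a 2D array as the texture source, the kernel output in linear memory would instead have to be copied into the array before each iteration. Note that within a single launch the kernel only reads one buffer and only writes the other, so it never fetches through the texture from memory it is writing.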

I did not try 1D arrays.