Linear Memory vs. CUDA Array for Texture Binding

Hi all,

I am trying to speed up my CUDA program by reducing the time my kernels spend accessing memory. Binding linear memory to a texture object gave me a large improvement over reading plain global memory.
My question is: could I get better timings if I bound the texture to a CUDA array instead of linear memory? Or are CUDA arrays only useful for access to the filtering functions and the more elaborate addressing modes when fetching?

Thanks in advance,
bye

For 1D accesses, CUDA arrays only have the benefits you mention: filtering and the extra addressing modes.
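
For reference, here is a minimal sketch of the 1D linear-memory case, assuming the texture object API from the CUDA runtime (cudaCreateTextureObject, tex1Dfetch). The kernel name copyViaTexture, the CHECK macro, and the buffer size are made up for illustration; this is not taken from the original poster's program.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: bail out of main() on the first CUDA error.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    std::printf("CUDA error: %s\n", cudaGetErrorString(e)); return 1; } } while (0)

// Hypothetical kernel: copies n floats, reading them through a texture object.
__global__ void copyViaTexture(cudaTextureObject_t tex, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // tex1Dfetch takes an integer element index: no filtering,
        // no normalized coordinates.
        out[i] = tex1Dfetch<float>(tex, i);
    }
}

int main()
{
    const int n = 1 << 20;  // arbitrary size chosen for the example
    std::vector<float> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float *d_in = nullptr, *d_out = nullptr;
    CHECK(cudaMalloc(&d_in, n * sizeof(float)));
    CHECK(cudaMalloc(&d_out, n * sizeof(float)));
    CHECK(cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice));

    // Describe the resource: plain linear device memory holding floats.
    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeLinear;
    resDesc.res.linear.devPtr = d_in;
    resDesc.res.linear.desc = cudaCreateChannelDesc<float>();
    resDesc.res.linear.sizeInBytes = n * sizeof(float);

    // Return the stored element type unchanged.
    cudaTextureDesc texDesc = {};
    texDesc.readMode = cudaReadModeElementType;

    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    copyViaTexture<<<(n + 255) / 256, 256>>>(tex, d_out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(h_out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost));

    // Count elements that did not survive the round trip.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) if (h_out[i] != h_in[i]) ++mismatches;
    std::printf("mismatches: %d\n", mismatches);

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));
    return mismatches != 0;
}
```

Swapping this for a 1D CUDA array would mainly change the resource description and the fetch call (tex1D instead of tex1Dfetch); whether it changes timings is something to measure on your own hardware.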

For 2D accesses with tex2D, CUDA arrays give better cache locality when you read up and down columns as well as across rows.
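
A rough sketch of the 2D case, assuming a texture object bound to a CUDA array and a kernel that reads each texel together with its neighbours above and below (the column-wise pattern mentioned here). The kernel name verticalSum3, the CHECK macro, and the 256x256 size are hypothetical choices for illustration.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: bail out of main() on the first CUDA error.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    std::printf("CUDA error: %s\n", cudaGetErrorString(e)); return 1; } } while (0)

// Hypothetical kernel: sums each texel with the texels above and below it.
__global__ void verticalSum3(cudaTextureObject_t tex, float* out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;
    // With unnormalized coordinates and point filtering, adding 0.5
    // addresses the centre of texel (x, y). Clamp mode handles the edges.
    float fx = x + 0.5f, fy = y + 0.5f;
    out[y * w + x] = tex2D<float>(tex, fx, fy - 1.0f)
                   + tex2D<float>(tex, fx, fy)
                   + tex2D<float>(tex, fx, fy + 1.0f);
}

int main()
{
    const int w = 256, h = 256;  // arbitrary size chosen for the example
    std::vector<float> h_in(w * h), h_out(w * h);
    for (int i = 0; i < w * h; ++i) h_in[i] = static_cast<float>(i);

    // Allocate a 2D CUDA array and copy the row-major host data into it.
    cudaChannelFormatDesc chan = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    CHECK(cudaMallocArray(&arr, &chan, w, h));
    CHECK(cudaMemcpy2DToArray(arr, 0, 0, h_in.data(), w * sizeof(float),
                              w * sizeof(float), h, cudaMemcpyHostToDevice));

    cudaResourceDesc resDesc = {};
    resDesc.resType = cudaResourceTypeArray;
    resDesc.res.array.array = arr;

    cudaTextureDesc texDesc = {};
    texDesc.addressMode[0] = cudaAddressModeClamp;
    texDesc.addressMode[1] = cudaAddressModeClamp;
    texDesc.filterMode = cudaFilterModePoint;
    texDesc.readMode = cudaReadModeElementType;
    texDesc.normalizedCoords = 0;

    cudaTextureObject_t tex = 0;
    CHECK(cudaCreateTextureObject(&tex, &resDesc, &texDesc, nullptr));

    float* d_out = nullptr;
    CHECK(cudaMalloc(&d_out, w * h * sizeof(float)));
    dim3 block(16, 16), grid((w + 15) / 16, (h + 15) / 16);
    verticalSum3<<<grid, block>>>(tex, d_out, w, h);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(h_out.data(), d_out, w * h * sizeof(float), cudaMemcpyDeviceToHost));

    // Compare against a host reference that clamps rows the same way.
    int mismatches = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            int up = std::max(y - 1, 0), down = std::min(y + 1, h - 1);
            float ref = h_in[up * w + x] + h_in[y * w + x] + h_in[down * w + x];
            if (h_out[y * w + x] != ref) ++mismatches;
        }
    std::printf("mismatches: %d\n", mismatches);

    CHECK(cudaDestroyTextureObject(tex));
    CHECK(cudaFreeArray(arr));
    CHECK(cudaFree(d_out));
    return mismatches != 0;
}
```

To compare against linear memory for the same access pattern, you could bind pitched memory with cudaResourceTypePitch2D and time both versions; the sketch above only checks correctness, not speed.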

Did you try including LAPACK in a CUDA program?