I am new to CUDA. After reading the Programming Guide 1.0, I am confused about texture memory and shared memory. The guide claims that using shared memory can greatly improve computing efficiency, and it also mentions that reading device memory through texture fetching can be an advantageous alternative to reading device memory from global or constant memory.
My question is: which one is better? Can I combine them to get better performance? So far I have referred to the sample code provided by NVIDIA and successfully implemented a median filter that reads device memory through texture fetching. I just wonder if there is any way to get higher performance.
Thank you for your reply.