Question about using texture/shared memory to improve computing efficiency

ypyp · December 3, 2007, 2:08pm

Hi
I am new to CUDA. After reading the Programming Guide 1.0, I am confused about texture memory and shared memory. The document claims that using shared memory can greatly improve computing efficiency, and it also mentions that reading device memory through texture fetching can be an advantageous alternative to reading device memory from global or constant memory.

My question is, which one is better? Can I combine them to get better performance? So far I have followed the sample code provided by NVIDIA and successfully run a median filter that reads device memory through texture fetching. I just wonder if there is any way to get higher performance.

Thank you for your reply.

MisterAnderson42 · December 3, 2007, 2:40pm

Shared memory can be used as a very effective cache when a block of threads cooperate because they can read/write to the same fast shared memory pool. When reading values into shared memory, they can be read from constant, texture, or normal global memory: it doesn’t matter.
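
To make the cooperative caching idea concrete, here is a minimal sketch, assuming a 1D three-point sum as the workload. The names `sum3Shared` and `checkSum3Shared` and the block size of 128 are illustrative choices, not something from this thread. Each block loads its inputs from global memory into shared memory once, and each thread then reads its neighbours from the on-chip copy.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 128  // threads per block (illustrative; launch must use this size)

// Each block stages BLOCK inputs plus one halo cell on each side in shared
// memory, then every thread sums its three neighbouring values from the tile.
__global__ void sum3Shared(const float* in, float* out, int n)
{
    __shared__ float tile[BLOCK + 2];
    int g = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    int l = threadIdx.x + 1;                        // index inside the tile

    // Coalesced load of the block's own elements (zero past the end).
    tile[l] = (g < n) ? in[g] : 0.0f;
    // Thread 0 also fetches the left and right halo cells.
    if (threadIdx.x == 0) {
        int left = g - 1;
        int right = g + BLOCK;
        tile[0]         = (left >= 0 && left < n) ? in[left] : 0.0f;
        tile[BLOCK + 1] = (right < n) ? in[right] : 0.0f;
    }
    __syncthreads();  // the whole tile must be written before anyone reads it

    if (g < n)
        out[g] = tile[l - 1] + tile[l] + tile[l + 1];
}

// Hypothetical host helper: returns the number of mismatches against a CPU
// reference, or -1 if a CUDA call fails.
int checkSum3Shared(int n)
{
    std::vector<float> h(n), r(n);
    for (int i = 0; i < n; ++i) h[i] = float(i % 7);

    float *dIn = nullptr, *dOut = nullptr;
    if (cudaMalloc(&dIn, n * sizeof(float)) != cudaSuccess) return -1;
    if (cudaMalloc(&dOut, n * sizeof(float)) != cudaSuccess) { cudaFree(dIn); return -1; }
    cudaMemcpy(dIn, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    sum3Shared<<<(n + BLOCK - 1) / BLOCK, BLOCK>>>(dIn, dOut, n);
    cudaError_t err = cudaMemcpy(r.data(), dOut, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        float left  = (i > 0) ? h[i - 1] : 0.0f;
        float right = (i + 1 < n) ? h[i + 1] : 0.0f;
        if (r[i] != left + h[i] + right) ++bad;
    }
    return bad;
}
```

With a window this small the gain over plain global reads may be modest; the pattern pays off more as the amount of reuse per loaded element grows.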

Texture access to global memory is useful when you simply cannot change your algorithm to get coalesced global memory reads. It is still of the utmost importance to have all texture reads within a warp have good spatial locality in the memory to get good performance, though.

nwilt · December 3, 2007, 2:50pm

One isn’t better than the other, they are different and complementary. Usually you can look at a given application and solidly conclude that one is more suitable than the other. There isn’t that much overlap since shared memory is on-chip and instanced by the hardware on a per thread block basis, while texture aliases memory that was allocated by the application.

They can be combined for better performance. The sobelFilter application in the SDK uses texture to stage pixel data into shared memory to take advantage of data reuse. The pixel accesses from shared memory are much lower latency than they would be from texture, and shared memory can deliver more data per clock.
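
A rough sketch of that staging pattern follows. It is not the SDK's sobelFilter code: it assumes a 3x3 box sum as a stand-in filter and uses the texture object API (`cudaTextureObject_t`), which postdates this thread. The names `box3x3TexShared` and `checkBox3x3` and the 16x16 tile size are hypothetical.

```cuda
#include <algorithm>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // block is TILE x TILE threads (illustrative choice)

// Stage a (TILE+2) x (TILE+2) window through the texture unit into shared
// memory, then compute a 3x3 box sum from the on-chip copy.
__global__ void box3x3TexShared(cudaTextureObject_t tex, float* out, int w, int h)
{
    __shared__ float tile[TILE + 2][TILE + 2];
    int x0 = blockIdx.x * TILE - 1;  // top-left corner of the tile, halo included
    int y0 = blockIdx.y * TILE - 1;

    // Cooperative fill; clamp addressing handles the image borders.
    for (int ty = threadIdx.y; ty < TILE + 2; ty += TILE)
        for (int tx = threadIdx.x; tx < TILE + 2; tx += TILE)
            tile[ty][tx] = tex2D<float>(tex, x0 + tx + 0.5f, y0 + ty + 0.5f);
    __syncthreads();

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < w && y < h) {
        float s = 0.0f;
        for (int dy = 0; dy < 3; ++dy)
            for (int dx = 0; dx < 3; ++dx)
                s += tile[threadIdx.y + dy][threadIdx.x + dx];
        out[y * w + x] = s;
    }
}

// Hypothetical host helper: returns mismatches against a CPU reference, or -1
// if a CUDA call fails.
int checkBox3x3(int w, int h)
{
    std::vector<float> img(w * h), res(w * h);
    for (int i = 0; i < w * h; ++i) img[i] = float(i % 5);

    cudaChannelFormatDesc fmt = cudaCreateChannelDesc<float>();
    cudaArray_t arr = nullptr;
    if (cudaMallocArray(&arr, &fmt, w, h) != cudaSuccess) return -1;
    cudaMemcpy2DToArray(arr, 0, 0, img.data(), w * sizeof(float),
                        w * sizeof(float), h, cudaMemcpyHostToDevice);

    cudaResourceDesc rd;
    std::memset(&rd, 0, sizeof(rd));
    rd.resType = cudaResourceTypeArray;
    rd.res.array.array = arr;
    cudaTextureDesc td;
    std::memset(&td, 0, sizeof(td));
    td.addressMode[0] = cudaAddressModeClamp;
    td.addressMode[1] = cudaAddressModeClamp;
    td.filterMode = cudaFilterModePoint;    // exact texel values, no filtering
    td.readMode = cudaReadModeElementType;
    td.normalizedCoords = 0;
    cudaTextureObject_t tex = 0;
    if (cudaCreateTextureObject(&tex, &rd, &td, nullptr) != cudaSuccess) {
        cudaFreeArray(arr);
        return -1;
    }

    float* dOut = nullptr;
    if (cudaMalloc(&dOut, w * h * sizeof(float)) != cudaSuccess) {
        cudaDestroyTextureObject(tex);
        cudaFreeArray(arr);
        return -1;
    }
    dim3 block(TILE, TILE), grid((w + TILE - 1) / TILE, (h + TILE - 1) / TILE);
    box3x3TexShared<<<grid, block>>>(tex, dOut, w, h);
    cudaError_t err = cudaMemcpy(res.data(), dOut, w * h * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaDestroyTextureObject(tex);
    cudaFreeArray(arr);
    cudaFree(dOut);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
            float s = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    int cx = std::min(std::max(x + dx, 0), w - 1);
                    int cy = std::min(std::max(y + dy, 0), h - 1);
                    s += img[cy * w + cx];
                }
            if (res[y * w + x] != s) ++bad;
        }
    return bad;
}
```

Here each texel is fetched through the texture unit roughly once per block, while the nine reads per output pixel hit shared memory. The same structure should carry over to a median or Sobel filter by replacing the inner summation.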

Occasionally, you might find applications (like lookup tables) where the tradeoffs aren’t quite as clear. When that happens, often one of the resources is otherwise underutilized by the kernel so you tend to migrate to the other. If either is suitable and you really don’t know which is better, you should make the determination empirically; but I’d expect that to be rare.
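
For the empirical route, one common approach is to time each kernel variant with CUDA events. The sketch below assumes a trivial stand-in kernel (`scaleKernel`) and a hypothetical helper (`timeKernelMs`); the kernel under test would be swapped in for the stand-in.

```cuda
#include <cuda_runtime.h>

// Stand-in kernel; replace with the shared-memory or texture variant to compare.
__global__ void scaleKernel(float* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Hypothetical helper: average milliseconds per launch over `reps` launches,
// or -1 if a CUDA call fails.
float timeKernelMs(int n, int reps)
{
    float* d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d, 0, n * sizeof(float));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    int blocks = (n + 255) / 256;
    scaleKernel<<<blocks, 256>>>(d, n);  // warm-up launch, not timed

    cudaEventRecord(start);
    for (int r = 0; r < reps; ++r)
        scaleKernel<<<blocks, 256>>>(d, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);          // wait until the timed launches finish

    float ms = 0.0f;
    cudaError_t err = cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return (err == cudaSuccess) ? ms / reps : -1.0f;
}
```

Running the same helper around both variants, on realistic input sizes, gives a direct comparison; averaging over several launches smooths out launch overhead and timer noise.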

ypyp · December 4, 2007, 4:06am

Thank you for your reply, it is clearer now.
