Some confusion on using shared memory.

I did not go through your code.

But to benefit from shared memory, you need to make it act like a cache.

That is, you store global memory values in shared memory and then reuse them from shared memory many times. Only then does it pay off.

If you are going to use a global memory variable only once in its lifetime, then staging it in shared memory does not make sense.

Apply this principle to your code and see whether it answers your question.

Whatever data you expect to need frequently should be staged in shared memory explicitly. After the computation, store the results back in global memory, then fetch the next set from global memory into shared memory and do the same.
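To make that loop concrete, here is a minimal sketch, not taken from your code. The kernel name `rankWithinTile`, the tile size `TILE`, and the "rank inside the tile" computation are all made up for illustration; the only point is that every staged value gets read many times from shared memory before the block moves on to the next set. It assumes the kernel is launched with exactly `TILE` threads per block.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define TILE 128  // elements staged per block per iteration (hypothetical size)

// For each element, count how many elements of the same tile are smaller.
// Each staged value is read up to TILE times from shared memory.
__global__ void rankWithinTile(const int *in, int *out, int n)
{
    __shared__ int tile[TILE];

    // Each block walks over its tiles; the loop bound is the same for
    // every thread of the block, so __syncthreads() inside is safe.
    for (int base = blockIdx.x * TILE; base < n; base += gridDim.x * TILE) {
        int g = base + threadIdx.x;
        if (g < n) tile[threadIdx.x] = in[g];  // one global read per element
        __syncthreads();                       // tile is fully staged

        int count = min(TILE, n - base);       // valid elements in this tile
        if (g < n) {
            int mine = tile[threadIdx.x];
            int rank = 0;
            for (int j = 0; j < count; ++j)    // reuse from shared memory
                rank += (tile[j] < mine);
            out[g] = rank;                     // store result to global memory
        }
        __syncthreads();  // everyone is done before the tile is overwritten
    }
}

int main()
{
    const int n = 1000;
    std::vector<int> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = (i * 37) % 101;

    int *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, n * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    rankWithinTile<<<4, TILE>>>(d_in, d_out, n);  // blockDim.x must equal TILE
    cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Compare against a straightforward CPU version of the same computation.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        int base = (i / TILE) * TILE, rank = 0;
        for (int j = base; j < n && j < base + TILE; ++j)
            rank += (h_in[j] < h_in[i]);
        mismatches += (rank != h_out[i]);
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

If each value were read only once, the staging step would just add an extra copy and two barriers, which is the case where shared memory does not help.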


In my code I am trying to do this (store global memory values in shared memory and then reuse them from shared memory), but I cannot get a performance improvement, because a card with compute capability lower than 1.2 cannot coalesce 8-bit memory accesses. My code uses a char array, and I am trying to coalesce the accesses to it.

Can you give me any hints for handling this char array (inside a device function) so that I get a performance improvement?

I think I did. I asked you to access them through “int *” pointers, but only while fetching from global memory.
Once they are in shared memory, use them as characters.
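Something along these lines is what I mean. Treat it as a sketch under assumptions, not as your kernel: the name `addOne`, the block size `THREADS`, and the "add 1 to every byte" work are placeholders. It assumes the buffers come from `cudaMalloc` (so they are at least 4-byte aligned) and that `n` is a multiple of 4. Each thread fetches one 32-bit word from global memory, and after the barrier the same shared buffer is read byte by byte.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define THREADS 64                      // threads per block (hypothetical choice)
#define BYTES_PER_BLOCK (THREADS * 4)   // each thread fetches 4 chars as one word

__global__ void addOne(const unsigned char *in, unsigned char *out, int n)
{
    // Declared as words so the shared buffer is 4-byte aligned.
    __shared__ unsigned int tileWords[THREADS];

    int blockStart = blockIdx.x * BYTES_PER_BLOCK;  // first byte of this block
    int wordIdx = blockStart / 4 + threadIdx.x;     // word index, not byte index

    // Assumption: in/out are 4-byte aligned and n is a multiple of 4.
    const unsigned int *inWords = reinterpret_cast<const unsigned int *>(in);
    if (wordIdx * 4 < n)
        tileWords[threadIdx.x] = inWords[wordIdx];  // 32-bit global read
    __syncthreads();  // other threads will read bytes this thread loaded

    // View the same shared buffer as characters; byte k of the tile
    // corresponds to in[blockStart + k].
    unsigned char *tile = reinterpret_cast<unsigned char *>(tileWords);
    for (int i = threadIdx.x; i < BYTES_PER_BLOCK; i += THREADS) {
        if (blockStart + i < n)
            tile[i] = (unsigned char)(tile[i] + 1);
    }
    __syncthreads();  // all bytes updated before words are written out

    unsigned int *outWords = reinterpret_cast<unsigned int *>(out);
    if (wordIdx * 4 < n)
        outWords[wordIdx] = tileWords[threadIdx.x]; // 32-bit global write
}

int main()
{
    const int n = 1000;  // multiple of 4, as assumed above
    std::vector<unsigned char> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = (unsigned char)(i % 251);

    unsigned char *d_in, *d_out;
    cudaMalloc(&d_in, n);
    cudaMalloc(&d_out, n);
    cudaMemcpy(d_in, h_in.data(), n, cudaMemcpyHostToDevice);

    int blocks = (n + BYTES_PER_BLOCK - 1) / BYTES_PER_BLOCK;
    addOne<<<blocks, THREADS>>>(d_in, d_out, n);
    cudaMemcpy(h_out.data(), d_out, n, cudaMemcpyDeviceToHost);

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        mismatches += (h_out[i] != (unsigned char)(h_in[i] + 1));
    printf("mismatches: %d\n", mismatches);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The usual places this goes wrong are mixing up word and byte indices (the word index is the byte index divided by 4), a char pointer that is not 4-byte aligned, a length that is not a multiple of 4, or a missing `__syncthreads()` between the word load and the byte reads.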

Hi Sarnath,

I am fetching the unsigned char array as int*, and once it is in shared memory I use it as unsigned char, but it did not give the correct output.

Possibly because you did not code it right.