I wrote a kernel that reads from shared memory.
I ran it 5 times and timed each run.
Access time
(1) 0.180000 ms
(2) 0.021000 ms
(3) 0.014000 ms
(4) 0.014000 ms
(5) 0.014000 ms
I expected the shared memory access time to be the same on every run, but the first two runs are much slower.
I don’t know why.
[source code]
__global__ void speed_check(…)
{
    __shared__ int s_m[1024];
    …
    for ( i = 0; i < 1024; i++ )
    {
        tmp += s_m[i];
    }
    …
}
int main()
{
    …
    for ( i = 0; i < 5; i++ )
    {
        cutStartTimer(timer[i]);
        speed_check<<< 1, 1 >>>(…);
        cudaThreadSynchronize();
        cutStopTimer(timer[i]);
    }
    …
}
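
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same measurement. It is not my exact code: the elided parts above are replaced with assumptions. It times with CUDA events instead of the cutil timers. The `out` parameter, the `volatile` qualifier, the initialization of `s_m` to 1, and the `run_speed_check` helper with its `warm_up` flag are all hypothetical additions for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Single-thread kernel that sums a 1024-element shared-memory array,
// mirroring the loop in the post. `out` is a hypothetical parameter
// (the original argument list is elided) used to return the sum.
__global__ void speed_check(int *out)
{
    // `volatile` is an addition for this sketch: it forces real
    // shared-memory loads and stores instead of register reuse.
    __shared__ volatile int s_m[1024];

    // Assumption: fill the array so the sum is well defined.
    for (int i = 0; i < 1024; i++)
        s_m[i] = 1;

    int tmp = 0;
    for (int i = 0; i < 1024; i++)
        tmp += s_m[i];

    *out = tmp;
}

// Hypothetical helper: launches the kernel `runs` times, prints the
// elapsed time of each launch, and returns the sum from the last one.
// If `warm_up` is true, one untimed launch is done first.
int run_speed_check(int runs, bool warm_up)
{
    int *d_out = NULL;
    cudaMalloc(&d_out, sizeof(int));

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    if (warm_up) {
        speed_check<<<1, 1>>>(d_out);
        cudaDeviceSynchronize();
    }

    for (int i = 0; i < runs; i++) {
        cudaEventRecord(start, 0);
        speed_check<<<1, 1>>>(d_out);
        cudaEventRecord(stop, 0);
        cudaEventSynchronize(stop);   // wait until the kernel has finished

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);
        printf("(%d) %f ms\n", i + 1, ms);
    }

    int h_out = 0;
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return h_out;
}

int main()
{
    // Same shape as the loop in the post: 5 timed launches, no warm-up.
    int sum = run_speed_check(5, false);
    printf("sum = %d\n", sum);
    return 0;
}
```

Calling `run_speed_check(5, true)` instead does one untimed launch before the timed loop. If the slow first run disappears in that case, the extra time is probably one-time start-up cost rather than shared memory access itself. That is an assumption to test, not something I have measured.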