Shared load and store transactions behavior



I ran some experiments on shared memory, and there is one result I cannot figure out.

When I access 128 bits of data spread across the first four banks, I get 4 accesses. That part I understand: a thread can only handle 32 bits at a time.

When I create a conflict with another thread (for example, thread 0 in B0,0 B1,0 B2,0 B3,0 and thread 2 in B0,1 B1,1 B2,1 B3,1, where the notation is Bbank,row), I get 5 accesses. I presume it reads B0,0 then B0,1, B1,0 then B1,1, B2,0 then B2,1, and B3,0 then finally B3,1, with some kind of pipelined access, so we have n+3 transactions, where n is the number of conflicts. But when I reach 9 conflicts I only get 11 transactions, and this n+2 pattern holds until 17, where it becomes n+1, and finally it becomes n starting from 25 conflicts. With 64-bit accesses it happens only at 17, where it goes from n+1 to just n.

My question is: why is there a leap at 9, 17, and 25 (128 bits) and at 17 (64 bits)?

Here is my code (two attachments, 4.6 KB and 1.1 KB). To run it the way I do:
./a.out <size_data: either 32, 64 or 128> <according to size_data: 32, 16 or 8, so that conflicts always occur> 0 FFFFFFFF

I haven’t studied your code. However, for a 128-bit load or store per thread, with each thread requesting adjacent data, I would expect “leaps” at 9, 17, and 25 threads.

With 8 threads per warp requesting 128 bits/thread, you would have exactly one shared “transaction” per request (8 × 128 bits = 128 bytes, which matches 32 banks of 4 bytes each). 9 threads would bump you up to 2 transactions per request. 17 threads would bump you up to 3 transactions per request, and 25 threads would bump you up to 4 transactions per request.

Likewise with 64 bits requested per thread, you would observe a bump at 17, going from one transaction per request to 2.

So these numbers, and their relation to the bits requested per thread and the number of threads involved, are not surprising to me.

The main point of my experiment is that the data are NOT adjacent to each other; basically, they are stacked on top of each other. I do understand the mechanism when they are next to each other, but here I want to focus on conflicts.