Selective Broadcast Mechanism: is it possible?

The left side shows a regular broadcast; the right side is what I would like to do.

“x” just marks garbage elements.

is it possible?
or is there an efficient alternative?

Shared memory offers a broadcast mechanism, but not a selective one.

You have to implement it yourself.

Constant memory can also be very fast when all threads are reading the same value.

But I guess in your case you won’t be reading in multiples of 32, so this might not be the solution for you.

Hmm, this is just a quick idea I got… maybe it could be fast to let ONE warp do 3 broadcasts that overlap each other? Some pseudo code:

__shared__ float array[256];

__shared__ float vals[6*8]; // your result (added a little extra length)

int offset = 256/3; // integer division, gives 85 — right?

// broadcast one

vals[threadIdx.x] = array[0];

// broadcast two

vals[threadIdx.x + 8] = array[offset];

// broadcast three

vals[threadIdx.x + 16] = array[2*offset];

I guess this might be a fast way to do it. You might have to think about the indexing more than I did, but you get the basic idea?

Shared memory requests are per half-warp, consisting of 16 threads. There are no conflicts whatsoever between threads coming from different half-warps, and broadcast is done per half-warp too.

I think what Nikolai wants to do is:

vals[threadIdx.x] = array[threadIdx.x & ~0x7];

This memory operation will be resolved using the broadcast mechanism, but it takes two memory requests (as if we had a 2-way bank conflict). A 2-way bank conflict is not much.

Jimmy’s code introduces the extra transactions explicitly, which is not helping.

My statement is motivated by what can be read in the Programming Guide:

Following this, I understand that in our case the first half-warp can be serviced, for example, as follows:

Request 1: array[0] as broadcast, plus array[8] served ordinarily to one thread. Threads 0–8 are done.

Request 2: array[8] as broadcast; threads 9–15 are done.

Reading your replies gave me an idea that circumvents the whole need for the “selective broadcast mechanism”.

Thanks, guys! :thanks: