Copying data simultaneously

Hi all,
I have a quite specific problem, so here is what I want to do:

I have a BitIndex and an input array consisting of characters. The BitIndex marks the positions of the chars in the array that need to be copied to a new array. It looks like this:

0 0 0 1 1 0 1 0 … BitIndex
c0 c1 c2 c3 c4 c5 c6 c7 … Char Array

c3 c4 c6 … Result

Any idea how I could implement this in CUDA? Especially considering that the Char Array could be very large and that it might take several blocks to process it.

Thanks for your help
Picknick3r

List compaction, right?
This could be done using scan + scatter.
Also, you could store this in a PBO and use a geometry shader… though I never got this stable enough to be useful in the 0.8 era.

Yeah, list compaction. Could you be a bit more specific about scan + scatter?

One idea was to partition the BitIndex into regions containing equal numbers of 1s and then let each block process one of these partitions. But there is still the problem that a thread can’t know where (at which position in the result array) to write its char.

The approach with the geometry shader sounds interesting. Do you have some example code or something similar? I'm quite new to all this stuff and need good examples.
What were your problems back when you tried it?

Thx again


See this thread:…ndpost&p=197515
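
In case it helps, here is a minimal sequential sketch of the scan + scatter idea mentioned above, written in plain C++ rather than CUDA and not taken from the linked thread. It is meant to show how a thread can find out where to write: an exclusive prefix sum (scan) over the BitIndex gives, for every position, the number of 1s before it, which is exactly the output index for that char. The function name compact and the use of one int per flag are assumptions made for illustration; a real BitIndex would probably be packed into words.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sequential reference for scan + scatter compaction.
// "compact" is a hypothetical helper name, not part of CUDA or any library.
// Assumes flags.size() == input.size() and every flag is 0 or 1.
std::string compact(const std::vector<int>& flags, const std::string& input) {
    const std::size_t n = input.size();

    // Step 1 (scan): offsets[i] = number of set flags strictly before i.
    // On the GPU this would be a parallel exclusive prefix sum.
    std::vector<std::size_t> offsets(n);
    std::size_t running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        offsets[i] = running;
        running += static_cast<std::size_t>(flags[i]);
    }

    // After the scan, "running" is the total number of selected chars.
    std::string result(running, '\0');

    // Step 2 (scatter): every iteration is independent of the others,
    // so on the GPU each one could be handled by its own thread.
    for (std::size_t i = 0; i < n; ++i) {
        if (flags[i]) {
            result[offsets[i]] = input[i];
        }
    }
    return result;
}
```

With the example from the first post, the flags 0 0 0 1 1 0 1 0 scan to the offsets 0 0 0 0 1 2 2 3, so c3 goes to position 0, c4 to position 1 and c6 to position 2. In a CUDA version only the scan needs cooperation between threads and blocks; the scatter needs nothing beyond the thread's own index, which is why the size of the Char Array is not a problem for that step.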