Atomic operations and Block communication

Sarnath · December 10, 2007, 5:23pm

I need a way to communicate data between blocks. The kernel has many blocks, each of which operates on some data and arrives at some results. Each block then has to communicate some of its results (say 5 integers) to the previous block.

Like this: Block n needs to communicate data to Block (n-1).
Block (n-1) needs to accept data from Block n and communicate some
data to Block (n-2).
And so on.

Will this be possible by making use of “Atomic” operations?

I am interested in the case where there are more blocks than it takes to keep all 16 multiprocessors busy. In that case, a block may be ready to accept data while its “producer” block has NOT been scheduled yet, and so on.

Any thoughts?

seb · December 10, 2007, 8:58pm

This has been discussed here many times. The search function of this forum will help you find thoughts and ideas.

In general I think it is safe to say that this is very cumbersome, because the architecture is not intended to do something like this.
I think it is possible for very specific problems, but I don’t think there is a solution to the problem you mention (data required from a block that has not been scheduled), because we have no influence on how the GPU schedules blocks. You could order the blocks so that this won’t happen. You could also try something like this: http://forums.nvidia.com/index.php?showtopic=53009

Depending on your algorithm it might also be possible to redesign it to circumvent the need for global synchronization.

Sarnath · December 11, 2007, 2:11am

OK. I see that compute capability 1.1 devices have these atomic operations.

Were they introduced to synchronize between blocks? I know they sync between threads. I would assume that this applies to threads from different blocks too. No?

What do you mean by “order the blocks”? Is it possible to specify an order of execution among blocks?

seb · December 11, 2007, 2:32am

The atomic operations are what they are. Atomic operations. I don’t see how and why atomic operations would synchronize between blocks. Documentation states “it [the atomic operation] is guaranteed to be performed without interference from other threads”. I don’t know if it synchronizes all running threads (I have no 1.1 capable device to play with) but if it does I would guess it synchronizes only running threads.
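To make the quoted guarantee concrete, here is a minimal sketch, assuming a device of compute capability 1.1 or later and a counter in global memory. Every thread of every block increments the same counter, and no update is lost, so the atomicity does hold across blocks. What the sketch does not do is make any thread wait for another: nothing here is a barrier. The names `count_threads` and `run_count` are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every thread of every block increments one shared counter in global memory.
__global__ void count_threads(unsigned int *counter)
{
    // Atomic read-modify-write: no increment is lost, whichever block
    // the calling thread belongs to. No thread waits for any other.
    atomicAdd(counter, 1u);
}

// Hypothetical host helper: launches the kernel and returns the final count.
unsigned int run_count(int blocks, int threads)
{
    unsigned int *d_counter = nullptr;
    unsigned int h_counter = 0;

    cudaMalloc(&d_counter, sizeof(unsigned int));
    cudaMemset(d_counter, 0, sizeof(unsigned int));

    count_threads<<<blocks, threads>>>(d_counter);

    // The copy back waits for the kernel to finish.
    cudaMemcpy(&h_counter, d_counter, sizeof(h_counter), cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return h_counter;
}

int main()
{
    const int blocks = 64, threads = 128;
    unsigned int total = run_count(blocks, threads);
    printf("counted %u of %d threads\n", total, blocks * threads);
    return total == (unsigned int)(blocks * threads) ? 0 : 1;
}
```

With a plain `*counter += 1` in place of `atomicAdd`, increments from different threads could overwrite each other and the final count would usually come out too low.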

I’m sorry, I guess my statement about block ordering was misleading. No, it is not possible. But the thread I mentioned describes a method by which something similar to block ordering can be achieved. You basically don’t order the blocks themselves, but assign the work in the order you want it executed.
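One way to read “assign the work in the order you want it executed” is a ticket counter: each block takes a ticket with `atomicAdd` when it starts running and derives its logical position from the ticket instead of from `blockIdx.x`. The following is only a sketch under assumptions, not the method from the linked thread. It assumes that a block which has already started keeps making progress, and that the toolkit provides `__threadfence()` (which is newer than this discussion). The names `ticket_counter`, `chain_kernel`, `run_chain`, `mailbox` and `ready` are hypothetical. The tickets are reversed so that logical block n+1, the producer, always starts before logical block n, the consumer, which means the consumer never spins on a block that has not been scheduled.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define PAYLOAD 5  // "say 5 integers", as in the question

// Hypothetical ticket dispenser; must be reset to 0 before every launch.
__device__ unsigned int ticket_counter;

__global__ void chain_kernel(volatile int *mailbox, volatile int *ready, int *out)
{
    __shared__ unsigned int logical;

    if (threadIdx.x == 0) {
        // Tickets are handed out in the order blocks actually start running.
        unsigned int ticket = atomicAdd(&ticket_counter, 1u);
        // Reversed, so producer n+1 holds an earlier ticket than consumer n.
        logical = gridDim.x - 1 - ticket;
    }
    __syncthreads();
    unsigned int n = logical;  // use n wherever blockIdx.x would have been used

    if (threadIdx.x == 0) {
        int received[PAYLOAD] = {0};       // the last logical block receives nothing
        if (n + 1 < gridDim.x) {
            while (ready[n + 1] == 0) { }  // producer has already started (assumption: it progresses)
            __threadfence();
            for (int i = 0; i < PAYLOAD; ++i)
                received[i] = mailbox[(n + 1) * PAYLOAD + i];
        }
        // Stand-in for real work: pass on each received value plus one.
        for (int i = 0; i < PAYLOAD; ++i)
            mailbox[n * PAYLOAD + i] = received[i] + 1;
        __threadfence();                   // publish the payload before raising the flag
        ready[n] = 1;
        out[n] = received[0] + 1;          // length of the chain that reached block n
    }
}

// Hypothetical host helper: returns the value that arrived at logical block 0.
int run_chain(int num_blocks)
{
    int *mailbox = nullptr, *ready = nullptr, *out = nullptr;
    unsigned int zero = 0;
    int first = -1;

    cudaMalloc(&mailbox, num_blocks * PAYLOAD * sizeof(int));
    cudaMalloc(&ready, num_blocks * sizeof(int));
    cudaMalloc(&out, num_blocks * sizeof(int));
    cudaMemset(ready, 0, num_blocks * sizeof(int));
    cudaMemcpyToSymbol(ticket_counter, &zero, sizeof(zero));

    chain_kernel<<<num_blocks, 32>>>(mailbox, ready, out);

    cudaMemcpy(&first, out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(mailbox);
    cudaFree(ready);
    cudaFree(out);
    return first;
}

int main()
{
    const int num_blocks = 4096;  // far more blocks than can be resident at once
    int first = run_chain(num_blocks);
    printf("logical block 0 received a chain of length %d\n", first);
    return first == num_blocks ? 0 : 1;
}
```

The cost of this design is that the chain is serial: each logical block waits for its producer before it can publish, so it only pays off if the blocks do substantial independent work before the hand-off. Spinning on `blockIdx.x` directly instead of on a ticket could hang, because the block being waited for might never get a multiprocessor.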
