How can I use only part of the GPU? Controlling thread block dispatch

Hi all,

I am new to CUDA and wanted to do a small test. If I have a grid of two thread blocks and want them to run on the same multiprocessor, how can I do that?

The code:
dim3 dimBlock(2,8);
dim3 dimGrid(2,1,1);
myKernel<<<dimGrid, dimBlock>>>(arg1, arg2);

doesn’t seem to give me any control over whether those two thread blocks end up on the same multiprocessor or not.

PS: the GPU I am using is 8800 GTX
PPS: this question was posted in the general discussion subforum before I realised there is a programming subforum. I hope this re-post doesn’t violate the forum regulations.

Many thanks,

I’m not aware of any way to do that. If you want to see what happens when two blocks run on the same multiprocessor, you’ll have to launch at least 17 blocks, since the 8800 GTX has 16 multiprocessors and the scheduler will then be forced to put two blocks on one of them.
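If you want to observe which multiprocessor each block actually lands on, a rough sketch is below. It reads the `%smid` special register through inline PTX; note this register is undocumented and intended for debugging, so treat the result as informational only, and the kernel name and array layout here are just made up for illustration:

```cuda
#include <cstdio>

// Read the %smid special register, which holds the index of the
// multiprocessor the current block is executing on.
__device__ unsigned int smid(void) {
    unsigned int id;
    asm("mov.u32 %0, %%smid;" : "=r"(id));
    return id;
}

__global__ void whereAmI(unsigned int *out) {
    // One thread per block records the block's multiprocessor index.
    if (threadIdx.x == 0 && threadIdx.y == 0)
        out[blockIdx.x] = smid();
}

int main(void) {
    const int numBlocks = 17;  // more than the 16 SMs on an 8800 GTX,
                               // so at least two blocks must share one
    unsigned int *d_out, h_out[numBlocks];
    cudaMalloc(&d_out, numBlocks * sizeof(unsigned int));

    whereAmI<<<numBlocks, dim3(2, 8)>>>(d_out);

    cudaMemcpy(h_out, d_out, numBlocks * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    for (int i = 0; i < numBlocks; ++i)
        printf("block %d ran on SM %u\n", i, h_out[i]);

    cudaFree(d_out);
    return 0;
}
```

With 17 blocks you should see at least one SM index repeated in the output, confirming two blocks shared a multiprocessor.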

There is no way to do that in the current CUDA release.