I am new to CUDA and wanted to run a small test. If I have a grid of two thread blocks and want them to run on the same multiprocessor, how can I do that?
myKernel<<<dimGrid, dimBlock>>>(arg1, arg2);
doesn’t seem to give me any control over whether those two thread blocks are placed on the same multiprocessor.
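To make the question concrete, here is a minimal sketch of the kind of test I have in mind. It only reports which multiprocessor each block landed on; it does not control placement. It assumes a toolkit that supports inline PTX, so the kernel can read the `%smid` special register, and the names `whichSm`, `dSm` and `hSm` are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 of each block records the ID of the
// multiprocessor (SM) it is running on, read from the PTX special
// register %smid.
__global__ void whichSm(unsigned int *smOfBlock)
{
    if (threadIdx.x == 0) {
        unsigned int smid;
        asm volatile("mov.u32 %0, %%smid;" : "=r"(smid));
        smOfBlock[blockIdx.x] = smid;
    }
}

int main()
{
    const int numBlocks = 2;      // the two-block grid from the question
    dim3 dimGrid(numBlocks);
    dim3 dimBlock(64);            // block size chosen arbitrarily

    unsigned int hSm[numBlocks];
    unsigned int *dSm = NULL;
    if (cudaMalloc((void **)&dSm, sizeof(hSm)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    whichSm<<<dimGrid, dimBlock>>>(dSm);

    // cudaMemcpy waits for the kernel to finish and also surfaces
    // any error from the launch.
    cudaError_t err = cudaMemcpy(hSm, dSm, sizeof(hSm), cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(dSm);
        return 1;
    }

    for (int b = 0; b < numBlocks; ++b)
        printf("block %d ran on SM %u\n", b, hSm[b]);

    cudaFree(dSm);
    return 0;
}
```

The launch line has the same form as the one above: the execution configuration takes only the grid and block dimensions, and I don't see an argument that names a multiprocessor.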
PS: The GPU I am using is an 8800 GTX.
PPS: This question was posted in the general discussion subforum before I realised there is a programming subforum. I hope this re-post doesn’t violate the forum regulations.