I have a CPU with 8 threads, and I want to split the resources of my GPU evenly among them. My GPU supports up to 65536 blocks, and I would like to set aside 8000 blocks for each CPU thread.
What is the best way to go about this? My algorithm knows which CPU thread it is on. I have two ideas for how to do this, but I don't know which, if either, would work.
- Is it possible to specify which blocks to use on my GPU? i.e. blocks 0-7999 for CPUThread 1, 8000-15999 for CPUThread 2… (a rough sketch of what I mean is below, after the code example for the second idea)
- Is it possible for the different CPU threads to each call the GPU kernel individually (but in parallel)? Example below:
__global__ void kernel( float *a, float *b, float *c, int *CPUThreadIndex )
{
    // Only the 8000 blocks belonging to this CPU thread do any work
    int cpuThread = *CPUThreadIndex;
    if (cpuThread * 8000 <= blockIdx.x && blockIdx.x < (cpuThread + 1) * 8000)
    {
        // Do stuff on the kernel
    }
}

void main(float a, float b, int CPUThreadIndex)
{
    …
    kernel<<<64000, 1>>>( dev_a, dev_b, dev_c, dev_CPUThreadIndex );
    …
}
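For the first idea, here is a rough sketch of what I have in mind: each CPU thread launches only its own 8000 blocks and passes in a block offset, instead of filtering inside a 64000-block grid. The kernelWithOffset name, the offset parameter, and the one-element-per-block work are just placeholders I made up:

// Sketch only: each CPU thread gets its own 8000-block launch
__global__ void kernelWithOffset( float *a, float *b, float *c, int blockOffset )
{
    int globalBlock = blockIdx.x + blockOffset;        // logical block id in 0..63999
    c[globalBlock] = a[globalBlock] + b[globalBlock];  // placeholder work (assumes 64000 elements)
}

// Called from CPU thread number CPUThreadIndex (0..7):
// kernelWithOffset<<<8000, 1>>>( dev_a, dev_b, dev_c, CPUThreadIndex * 8000 );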
If each separate CPU thread reaches the main function at around the same time, would something like the method above work in parallel?
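To make that concrete, this is roughly what I picture on the host side, assuming std::thread and one CUDA stream per CPU thread so the launches don't all pile onto the default stream (the hostThread name and the stream-per-thread choice are just my guesses):

#include <cuda_runtime.h>
#include <thread>
#include <vector>

__global__ void kernelWithOffset( float *a, float *b, float *c, int blockOffset );  // from the sketch above

// Each CPU thread launches its own 8000 blocks into its own stream
void hostThread( float *dev_a, float *dev_b, float *dev_c, int CPUThreadIndex )
{
    cudaStream_t stream;
    cudaStreamCreate( &stream );

    kernelWithOffset<<<8000, 1, 0, stream>>>( dev_a, dev_b, dev_c, CPUThreadIndex * 8000 );

    cudaStreamSynchronize( stream );
    cudaStreamDestroy( stream );
}

int main()
{
    const int N = 64000;                               // one element per logical block
    float *dev_a, *dev_b, *dev_c;
    cudaMalloc( &dev_a, N * sizeof(float) );
    cudaMalloc( &dev_b, N * sizeof(float) );
    cudaMalloc( &dev_c, N * sizeof(float) );
    // … copy input data into dev_a and dev_b here …

    std::vector<std::thread> threads;
    for (int t = 0; t < 8; ++t)                        // one host thread per slice of 8000 blocks
        threads.emplace_back( hostThread, dev_a, dev_b, dev_c, t );
    for (auto &th : threads)
        th.join();

    cudaFree( dev_a );
    cudaFree( dev_b );
    cudaFree( dev_c );
    return 0;
}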
Thanks for any help you guys can provide.