Suppose separate CPU threads acquire mapped memory and a GPU pointer to it:
HANDLE_ERROR(cudaHostAlloc((void **)&PB->P, pbpsz, cudaHostAllocMapped));
HANDLE_ERROR(cudaHostGetDevicePointer(&PB->devP, PB->P, 0));
Then a CPU thread starts a kernel on the GPU that uses such mapped memory while other CPU threads
are busy loading and unloading separate, disjoint buffer areas of mapped memory.
I think the GPU dies when accessing some of this mapped memory, although it is guaranteed
that no CPU thread is reading or writing the block the kernel is using.
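To make the setup described above concrete, here is a minimal single-threaded sketch of it, with error checks after the launch so that a failing kernel reports an error code instead of just appearing to die. The HANDLE_ERROR definition, the kernel scaleBlock, and the buffer size are assumptions for illustration and are not taken from my real code. On older toolkits cudaSetDeviceFlags(cudaDeviceMapHost) has to come before anything that creates the context; as far as I know, recent runtimes imply it.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Stand-in for the HANDLE_ERROR macro used above (assumed to abort on failure).
#define HANDLE_ERROR(call)                                              \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_),  \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Hypothetical kernel: doubles every element of one block.
__global__ void scaleBlock(float *block, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) block[i] *= 2.0f;
}

int main()
{
    const int n = 1024;     // illustrative size
    float *P = nullptr;     // host view of the mapped buffer
    float *devP = nullptr;  // device view of the same buffer

    // Request host-memory mapping before the context is created.
    HANDLE_ERROR(cudaSetDeviceFlags(cudaDeviceMapHost));
    HANDLE_ERROR(cudaHostAlloc((void **)&P, n * sizeof(float), cudaHostAllocMapped));
    HANDLE_ERROR(cudaHostGetDevicePointer((void **)&devP, P, 0));

    for (int i = 0; i < n; ++i) P[i] = 1.0f;  // CPU fills the block first

    scaleBlock<<<(n + 255) / 256, 256>>>(devP, n);
    HANDLE_ERROR(cudaGetLastError());       // reports launch errors
    HANDLE_ERROR(cudaDeviceSynchronize());  // reports errors raised while the kernel ran

    printf("P[0] = %f\n", P[0]);  // CPU reads only after the synchronize
    HANDLE_ERROR(cudaFreeHost(P));
    return 0;
}
```

If the kernel really faults on a mapped address, the cudaDeviceSynchronize check is where I would expect the error to surface, which at least narrows down which launch is responsible.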
Is there some precaution I can take so that the kernel can safely access such a block
while other CPU threads are busy banging on different blocks?
I am careful that the above holds, and that no collision occurs while
the kernel is operating.
The blocks of memory are all disjoint; I made very sure of that too.
This whole effort is because I would like to extend some producer-consumer code that works
fine in the CPU-only context.
I wonder if some use of streams would help. The kernel can run, but it dare not look at
some memory items, for example by printing them with printf, without failing, even if I can
guarantee that the items printed are not being modified. They may be in different buffers, and
the buffer addresses in the GPU address space may be close together, but they never overlap.
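Here is a rough sketch of what I have in mind with streams, assuming one CPU thread per block and one stream per thread. Each thread fills its own disjoint block, launches a kernel on its own stream, and waits only on that stream before reading the block back. The names CHECK, addOne, and worker, and the block count and size, are made up for illustration. I am not claiming that streams by themselves fix the failure; the sketch only shows the hand-off order I believe is required (CPU writes, then launch, then stream synchronize, then CPU reads).

```cuda
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check macro, similar in spirit to HANDLE_ERROR.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_),  \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Hypothetical kernel: adds one to every element of a block.
__global__ void addOne(int *block, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) block[i] += 1;
}

// Each CPU thread owns one disjoint block and one stream.
void worker(int *hostBlock, int *devBlock, int n, int id)
{
    cudaStream_t stream;
    CHECK(cudaStreamCreateWithFlags(&stream, cudaStreamNonBlocking));

    for (int i = 0; i < n; ++i) hostBlock[i] = id;  // CPU loads the block

    addOne<<<(n + 255) / 256, 256, 0, stream>>>(devBlock, n);
    CHECK(cudaGetLastError());
    CHECK(cudaStreamSynchronize(stream));  // waits for this thread's kernel only

    // Only now does the CPU touch the block again.
    printf("block %d: first element = %d\n", id, hostBlock[0]);
    CHECK(cudaStreamDestroy(stream));
}

int main()
{
    const int nBlocks = 4;   // illustrative
    const int n = 1 << 16;   // elements per block, illustrative
    int *P = nullptr, *devP = nullptr;

    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    CHECK(cudaHostAlloc((void **)&P, (size_t)nBlocks * n * sizeof(int),
                        cudaHostAllocMapped));
    CHECK(cudaHostGetDevicePointer((void **)&devP, P, 0));

    std::vector<std::thread> threads;
    for (int b = 0; b < nBlocks; ++b)
        threads.emplace_back(worker, P + (size_t)b * n, devP + (size_t)b * n, n, b);
    for (auto &t : threads) t.join();

    CHECK(cudaFreeHost(P));
    return 0;
}
```

In the real producer-consumer code the block would go back to the loader threads only after cudaStreamSynchronize returns, so a block is never being loaded while a kernel still has it.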
Eventually I expect to have more than one GPU, with the CPU threads
loading mapped memory, and I know there is a stream parameter, so the CPU block loaders would
have to be aware of which stream each GPU is using.
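For the multi-GPU case, this is a sketch of how I imagine it, assuming the buffer is allocated with cudaHostAllocPortable in addition to cudaHostAllocMapped so that every device can map it, and that each device gets its own stream and its own block. CHECK, tagBlock, and the sizes are hypothetical. A stream belongs to the device that was current when it was created, so the code selects the device before creating the stream, before launching, and before synchronizing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check macro, similar in spirit to HANDLE_ERROR.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_),  \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Hypothetical kernel: writes the device index into every element of a block.
__global__ void tagBlock(int *block, int n, int dev)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) block[i] = dev;
}

int main()
{
    const int n = 1 << 16;  // elements per block, illustrative
    int nDev = 0;
    CHECK(cudaGetDeviceCount(&nDev));

    // Ask for host-memory mapping on every device.
    for (int d = 0; d < nDev; ++d) {
        CHECK(cudaSetDevice(d));
        CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    }

    // One buffer, one disjoint block per device, visible to all devices.
    int *P = nullptr;
    CHECK(cudaHostAlloc((void **)&P, (size_t)nDev * n * sizeof(int),
                        cudaHostAllocMapped | cudaHostAllocPortable));

    std::vector<cudaStream_t> streams(nDev);
    for (int d = 0; d < nDev; ++d) {
        CHECK(cudaSetDevice(d));
        CHECK(cudaStreamCreate(&streams[d]));

        int *devP = nullptr;  // this device's view of the buffer
        CHECK(cudaHostGetDevicePointer((void **)&devP, P, 0));

        tagBlock<<<(n + 255) / 256, 256, 0, streams[d]>>>(devP + (size_t)d * n, n, d);
        CHECK(cudaGetLastError());
    }

    for (int d = 0; d < nDev; ++d) {
        CHECK(cudaSetDevice(d));
        CHECK(cudaStreamSynchronize(streams[d]));  // block d is now safe to read
        printf("device %d wrote %d\n", d, P[(size_t)d * n]);
        CHECK(cudaStreamDestroy(streams[d]));
    }

    CHECK(cudaFreeHost(P));
    return 0;
}
```

A CPU block loader would then need to know both the device and the stream that currently own a given block before touching it.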