Mapped memory modified by a CPU thread different from the one that launched the GPU kernel

Suppose separate CPU threads acquire mapped memory and a GPU pointer to it:
HANDLE_ERROR(cudaHostAlloc((void**)&PB->P, pbpsz, cudaHostAllocMapped));
HANDLE_ERROR(cudaHostGetDevicePointer(&PB->devP, PB->P, 0));
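
For context, here is a minimal sketch of that allocation pattern filled out into a complete program. The PinnedBlock struct, the touch kernel, and this HANDLE_ERROR definition are stand-ins, since the original PB type and macro are not shown. The cudaSetDeviceFlags(cudaDeviceMapHost) call is what older toolkits required before any mapped allocation; newer runtimes treat it as implicit.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Stand-in for the HANDLE_ERROR macro used above (its real definition is not shown).
#define HANDLE_ERROR(call)                                              \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_),  \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Hypothetical stand-in for the PB structure.
struct PinnedBlock {
    int *P;     // host pointer to the mapped allocation
    int *devP;  // device-side alias of the same memory
};

// Adds 1 to every element through the device alias.
__global__ void touch(int *p, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 1;
}

// Returns how many elements hold the expected value after the kernel.
int run_mapped_demo(int n) {
    // Must precede context creation on older toolkits.
    HANDLE_ERROR(cudaSetDeviceFlags(cudaDeviceMapHost));

    cudaDeviceProp prop;
    HANDLE_ERROR(cudaGetDeviceProperties(&prop, 0));  // assumes device 0 is in use
    if (!prop.canMapHostMemory) {
        fprintf(stderr, "device cannot map host memory\n");
        exit(EXIT_FAILURE);
    }

    PinnedBlock pb;
    HANDLE_ERROR(cudaHostAlloc((void**)&pb.P, n * sizeof(int), cudaHostAllocMapped));
    HANDLE_ERROR(cudaHostGetDevicePointer((void**)&pb.devP, pb.P, 0));

    for (int i = 0; i < n; ++i) pb.P[i] = i;  // CPU fills the block before launch

    touch<<<(n + 255) / 256, 256>>>(pb.devP, n);
    HANDLE_ERROR(cudaGetLastError());
    HANDLE_ERROR(cudaDeviceSynchronize());  // kernel writes are visible to the CPU after this

    int ok = 0;
    for (int i = 0; i < n; ++i)
        if (pb.P[i] == i + 1) ++ok;

    HANDLE_ERROR(cudaFreeHost(pb.P));
    return ok;
}

int main() {
    const int n = 1024;
    int ok = run_mapped_demo(n);
    printf("%d of %d elements updated\n", ok, n);
    return ok == n ? 0 : 1;
}
```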

Then a CPU thread starts a kernel on the GPU that uses this mapped memory, while other CPU threads
are busy loading and unloading separate, disjoint buffer areas of mapped memory.
I think the GPU dies when accessing some of this mapped memory, although it is guaranteed
that no CPU thread is reading or writing the block the kernel is using.

Is there some precaution I can take so that the kernel can safely access such a block
while other CPU threads are busy banging on different blocks?
I have been careful to ensure that this is the case, and that no collision occurs while
the kernel is operating.
The blocks of memory are all disjoint; I made very sure of that too.
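
One precaution can be sketched, assuming each block has exactly one owner at a time. The kernel is launched in its own stream on block 0 while a second CPU thread writes block 1, and the CPU touches block 0 again only after cudaStreamSynchronize returns. All names here (scale, fill, run_handoff_demo, the two-block layout) are hypothetical. This shows the ordering only and is not a diagnosis of the failure described above.

```cuda
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <cuda_runtime.h>

// Hypothetical error-check helper for this sketch.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_),  \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Consumer: doubles every element of the block it was handed.
__global__ void scale(int *block, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) block[i] *= 2;
}

// Producer: a CPU thread filling one block.
static void fill(int *block, int n, int value) {
    for (int i = 0; i < n; ++i) block[i] = value;
}

// Returns the number of positions where both blocks hold the expected values.
int run_handoff_demo(int n) {
    int *host = nullptr, *dev = nullptr;
    CHECK(cudaHostAlloc((void**)&host, 2 * n * sizeof(int), cudaHostAllocMapped));
    CHECK(cudaHostGetDevicePointer((void**)&dev, host, 0));

    cudaStream_t stream;
    CHECK(cudaStreamCreate(&stream));

    fill(host, n, 1);  // block 0 is complete before the launch
    scale<<<(n + 255) / 256, 256, 0, stream>>>(dev, n);  // GPU now owns block 0
    CHECK(cudaGetLastError());

    // Another CPU thread writes block 1 while the kernel works on block 0.
    std::thread producer(fill, host + n, n, 7);

    CHECK(cudaStreamSynchronize(stream));  // block 0 returns to the CPU here
    producer.join();

    int ok = 0;
    for (int i = 0; i < n; ++i)
        if (host[i] == 2 && host[n + i] == 7) ++ok;

    CHECK(cudaStreamDestroy(stream));
    CHECK(cudaFreeHost(host));
    return ok;
}

int main() {
    const int n = 4096;
    int ok = run_handoff_demo(n);
    printf("%d of %d positions correct\n", ok, n);
    return ok == n ? 0 : 1;
}
```

Depending on the toolchain, the host compiler may need a thread flag, for example nvcc -Xcompiler -pthread.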

This whole effort is because I would like to extend some producer-consumer code that works
fine in a CPU-only context.

I wonder if some use of streams would help. The kernel runs, but it fails if it so much as
reads certain memory items to print them with printf, even though I can guarantee that the items
printed are not being modified. They may be in different buffers, and the buffer addresses
in the GPU address space may be close together, but they never overlap.

Eventually I expect to have more than one GPU, with the CPU threads
loading mapped memory. I know there is a stream parameter, so the CPU block loaders would
have to be aware of which stream the GPU is using.
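
A rough sketch of how the multi-GPU case might look, under stated assumptions. The buffer is allocated with cudaHostAllocPortable | cudaHostAllocMapped so that it is pinned for every device, each device gets its own stream and its own device pointer, and a loader waits on the stream that owns a block before reusing it. The names add_one and run_multi_gpu_demo and the one-block-per-device layout are hypothetical. It also runs with a single device.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check helper for this sketch.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s at %s:%d\n", cudaGetErrorString(err_),  \
                    __FILE__, __LINE__);                                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

__global__ void add_one(int *block, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) block[i] += 1;
}

// Gives each device one disjoint block; returns the number of updated elements.
int run_multi_gpu_demo(int n) {
    int ndev = 0;
    CHECK(cudaGetDeviceCount(&ndev));

    int *host = nullptr;
    CHECK(cudaHostAlloc((void**)&host, (size_t)ndev * n * sizeof(int),
                        cudaHostAllocPortable | cudaHostAllocMapped));
    for (size_t i = 0; i < (size_t)ndev * n; ++i) host[i] = 0;

    std::vector<cudaStream_t> streams(ndev);
    for (int d = 0; d < ndev; ++d) {
        CHECK(cudaSetDevice(d));
        CHECK(cudaStreamCreate(&streams[d]));  // stream belongs to device d

        // Query the alias per device; it may differ without unified addressing.
        int *dev = nullptr;
        CHECK(cudaHostGetDevicePointer((void**)&dev, host, 0));

        add_one<<<(n + 255) / 256, 256, 0, streams[d]>>>(dev + (size_t)d * n, n);
        CHECK(cudaGetLastError());
    }

    // A loader would wait on the owning stream before touching that block again.
    for (int d = 0; d < ndev; ++d) {
        CHECK(cudaSetDevice(d));
        CHECK(cudaStreamSynchronize(streams[d]));
        CHECK(cudaStreamDestroy(streams[d]));
    }

    int ok = 0;
    for (size_t i = 0; i < (size_t)ndev * n; ++i)
        if (host[i] == 1) ++ok;

    CHECK(cudaFreeHost(host));
    return ok;
}

int main() {
    const int n = 1024;
    int ok = run_multi_gpu_demo(n);
    printf("%d elements updated across all devices\n", ok);
    return ok > 0 ? 0 : 1;
}
```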

~

Are you sure the problem is caused by the CPU modifying data in mapped memory? That surely should not be the case. Does the problem still occur if you have the CPU wait with its modifications until the kernel is finished? Is the execution path of your kernel data-dependent?
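
The first experiment could look roughly like the sketch below. The CPU holds back its modifications until the kernel has finished, and the error codes from both the launch and the synchronize are checked, so that a kernel fault is reported rather than assumed. The kernel sum_block and the function run_serialized_check are hypothetical names, not taken from the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Single-thread kernel that reads the whole block and stores the sum in *out.
__global__ void sum_block(const int *block, int n, int *out) {
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int s = 0;
        for (int i = 0; i < n; ++i) s += block[i];
        *out = s;
    }
}

// Runs the kernel with all CPU writes held back until it has finished.
// Returns the first CUDA error seen, or cudaSuccess.
cudaError_t run_serialized_check(int n, int *result) {
    int *host = nullptr, *dev = nullptr;
    cudaError_t err = cudaHostAlloc((void**)&host, (n + 1) * sizeof(int),
                                    cudaHostAllocMapped);
    if (err != cudaSuccess) return err;

    err = cudaHostGetDevicePointer((void**)&dev, host, 0);
    if (err != cudaSuccess) {
        cudaFreeHost(host);
        return err;
    }

    for (int i = 0; i < n; ++i) host[i] = 1;
    host[n] = 0;  // slot for the kernel's result

    sum_block<<<1, 1>>>(dev, n, dev + n);
    err = cudaGetLastError();           // reports launch-time errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();  // reports errors raised while the kernel ran

    if (err == cudaSuccess) {
        *result = host[n];
        // CPU modifications happen only here, after the kernel is done.
        for (int i = 0; i < n; ++i) host[i] = 0;
    }

    cudaFreeHost(host);
    return err;
}

int main() {
    int sum = 0;
    cudaError_t err = run_serialized_check(1000, &sum);
    printf("status: %s, sum = %d\n", cudaGetErrorString(err), sum);
    return err == cudaSuccess ? 0 : 1;
}
```

If the serialized version runs cleanly and the concurrent one does not, that points toward the interaction with the CPU threads. If both fail, the kernel itself is the more likely suspect.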