I am running into some problems with the new Titan Xp on Windows 10. I searched the web and this forum, but I still have some open questions:
First of all, from what I recently learned, concurrentManagedAccess = 0 on the Pascal architecture is a bug in CUDA 8 on the Windows platform and is going to be fixed in CUDA 9.
What does concurrentManagedAccess = 0 really mean?
a. Does it mean that I cannot access managed memory from multiple devices (including the host) concurrently?
b. In this case, what happens when I allocate memory using cudaMallocManaged? Does page migration still work for non-concurrent access, or do I have to move the memory manually somehow (given that cudaMemPrefetchAsync requires concurrentManagedAccess to be 1)?
c. Will the memory be accessible from host and device if I guarantee they do not access it at the same time (for read and write)?
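To make questions a–c concrete, here is a minimal sketch of the access pattern I have in mind. It queries cudaDevAttrConcurrentManagedAccess and then strictly serializes host and device access with cudaDeviceSynchronize, which, as I understand the CUDA Programming Guide, is what is required when the attribute is 0. The kernel `increment` and the helper `run_serialized_access` are names I made up for illustration, and device 0 is an assumption.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

// Hypothetical helper: host writes, device updates, host reads, never overlapping.
// Returns the number of wrong elements (0 on success) or -1 on a CUDA error.
int run_serialized_access(int n) {
    const int device = 0;  // assumption: the Titan Xp is device 0
    int concurrent = 0;
    if (cudaDeviceGetAttribute(&concurrent, cudaDevAttrConcurrentManagedAccess,
                               device) != cudaSuccess) {
        return -1;
    }
    printf("concurrentManagedAccess = %d\n", concurrent);

    int *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(int)) != cudaSuccess) {
        return -1;
    }

    // Host access happens before any kernel is launched.
    for (int i = 0; i < n; ++i) data[i] = i;

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    increment<<<blocks, threads>>>(data, n);

    // The host does not touch `data` between the launch and this synchronize,
    // so the sketch does not depend on concurrent access being supported.
    if (cudaGetLastError() != cudaSuccess ||
        cudaDeviceSynchronize() != cudaSuccess) {
        cudaFree(data);
        return -1;
    }

    // Host access after the device is idle again.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (data[i] != i + 1) ++mismatches;
    }

    cudaFree(data);
    return mismatches;
}

int main() {
    int result = run_serialized_access(1 << 20);
    printf("mismatches = %d\n", result);
    return result == 0 ? 0 : 1;
}
```

I left cudaMemPrefetchAsync out of the sketch on purpose, since whether it can be used at all here is exactly what question b is about.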
With concurrentManagedAccess = 1, can I access managed memory (read only) from host code and a device kernel concurrently, and also from more than one device? Is the behavior the same on Kepler and Maxwell GPUs, except that on the older GPUs it is handled by the driver (software driven) rather than by the hardware (page fault and migration engine) as on Pascal?
(I might have more questions based on the answers to the questions above :))