I am trying to use the new features of NVLink, such as coherence, and I have a few questions:
- Is hardware coherence enabled between two GPUs connected with NVLink? If not, how do I turn it on? I tried a test program, and coherence is supported.
- What is the relationship between unified virtual memory and NVLink coherence? I tested this with a small program. It seems that unified virtual memory takes precedence over NVLink coherence when the memory is allocated with cudaMallocManaged; in that case, coherency is guaranteed by unified virtual memory.
- Do you have any suggestions on when I should use unified virtual memory versus NVLink coherence, in terms of performance? Do you have any examples?
Thank you so much!