What are the limitations of using cudaMemcpyPeerAsync, such as GPU model, motherboard type, and so on? Where can I find the restrictions? Is there anything in a graphics card's specifications that indicates this feature is available? Or, more directly, where can I find a list of hardware that supports this feature?
There isn’t any collated information provided by NVIDIA to make this determination.
The runtime method to determine this is cudaDeviceCanAccessPeer().
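As a rough illustration of that runtime check, here is a minimal sketch that queries every ordered pair of GPUs in the system. The program structure and printed messages are my own assumptions for illustration; only `cudaGetDeviceCount`, `cudaDeviceCanAccessPeer`, and `cudaGetErrorString` are actual CUDA runtime calls. The result reflects the machine it runs on, not the GPU model in general.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Peer capability is directional, so check each ordered (device, peer) pair.
    for (int dev = 0; dev < deviceCount; ++dev) {
        for (int peer = 0; peer < deviceCount; ++peer) {
            if (dev == peer) continue;

            int canAccess = 0;
            err = cudaDeviceCanAccessPeer(&canAccess, dev, peer);
            if (err != cudaSuccess) {
                fprintf(stderr, "cudaDeviceCanAccessPeer(%d, %d) failed: %s\n",
                        dev, peer, cudaGetErrorString(err));
                return 1;
            }
            printf("GPU %d -> GPU %d: peer access %s\n",
                   dev, peer, canAccess ? "supported" : "not supported");
        }
    }
    return 0;
}
```

Built with something like `nvcc check_peer.cu -o check_peer` (the file name is hypothetical), this prints one line per device pair on the system where it is run.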
For casual usage, you should use that method. For acquisition purposes, either confirm with an actual configuration tested, or make it a condition of purchase.
It’s possible to make various generalizations, which you can find with a bit of searching, but nearly all the generalizations I’m aware of change over time or, stated another way, have exceptions: 1 2 (not an exhaustive list)