Hello,
I have found that cudaDeviceCanAccessPeer() will report that devices cannot peer with themselves. This can be tested with a trivial program such as:
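#include <cuda_runtime_api.h>

int main() {
    int can = 0;
    /* Ask whether device 0 can access device 0 as a peer. */
    cudaError_t ret = cudaDeviceCanAccessPeer(&can, 0, 0);
    if (ret != cudaSuccess) return 127;  /* the query itself failed */
    return 1 - can;  /* 0 if peering is reported as possible, 1 otherwise */
}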
which will exit with error status 1.
I am aware that this is a very peculiar corner-case, but I would expect devices to be able to peer with themselves.
(I found this while testing our multi-GPU code with the same device specified twice. Our code creates one thread per device and tries to enable peer access between the devices specified for each thread; this fails when the same device is given twice, even though I would expect it to work since the device is physically the same.)
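A possible workaround on the application side, as a minimal sketch only: skip every pair that names the same device before asking for peer access at all. The names enable_peers_skip_self, enable_peer_fn, seen_before and count_only are hypothetical and not part of CUDA. In real code the callback is assumed to wrap cudaSetDevice() followed by cudaDeviceEnablePeerAccess(); here it is replaced by a stub so that the sketch compiles without the CUDA toolkit.

```c
#include <stddef.h>

/* Hypothetical callback type. In real code it is assumed to wrap
   cudaSetDevice(dev) followed by cudaDeviceEnablePeerAccess(peer, 0)
   and to return 0 on success. */
typedef int (*enable_peer_fn)(int dev, int peer, void *ctx);

/* Returns 1 if devs[idx] already appears at an earlier index. */
static int seen_before(const int *devs, size_t idx)
{
    for (size_t k = 0; k < idx; ++k)
        if (devs[k] == devs[idx])
            return 1;
    return 0;
}

/* Requests peer access once for every ordered pair of distinct devices
   in devs[]. Pairs naming the same device are skipped, as are repeated
   entries. Returns the number of failed requests. */
static int enable_peers_skip_self(const int *devs, size_t n,
                                  enable_peer_fn enable, void *ctx)
{
    int failures = 0;
    for (size_t i = 0; i < n; ++i) {
        if (seen_before(devs, i))
            continue;              /* device already handled */
        for (size_t j = 0; j < n; ++j) {
            if (devs[i] == devs[j] || seen_before(devs, j))
                continue;          /* same device, or duplicate entry */
            if (enable(devs[i], devs[j], ctx) != 0)
                ++failures;
        }
    }
    return failures;
}

/* Demonstration stub standing in for the CUDA wrapper: it only counts
   the requests and remembers the last pair. */
struct call_record { int calls; int last_dev; int last_peer; };

static int count_only(int dev, int peer, void *ctx)
{
    struct call_record *rec = ctx;
    rec->calls++;
    rec->last_dev = dev;
    rec->last_peer = peer;
    return 0;
}
```

With the device list {0, 0} this makes no peer requests at all, so the "same device twice" test configuration would not reach the failing call. It does not answer whether the runtime itself could or should accept a self-peer.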
I wonder: is this an oversight (“nobody would ever do that”), or are there actual limitations that prevent it?