I’ve just installed two GTX 1080 Ti cards in a Threadripper 1950X system. However, when I run the P2P benchmarks from the CUDA samples (such as simpleP2P and p2pBandwidthLatencyTest), they crash.
The crash appears to be caused by the following function call:
cudaMemcpy(g1, g0, buf_size, cudaMemcpyDefault);
where g0 and g1 are defined as:
float *g0;
checkCudaErrors(cudaMalloc(&g0, buf_size));
float *g1;
checkCudaErrors(cudaMalloc(&g1, buf_size));
I’ve also enabled AMD-Vi and the IOMMU, but it still does not work. Does this mean that CUDA’s UVA only works on Intel platforms?
Sorry, I made a mistake. The code that actually causes the problem is:
printf("Run kernel on GPU%d, taking source data from GPU%d and writing to GPU%d...\n",
gpuid[0], gpuid[1], gpuid[0]);
checkCudaErrors(cudaSetDevice(gpuid[0]));
SimpleKernel<<<blocks, threads>>>(g1, g0);
checkCudaErrors(cudaDeviceSynchronize());
Similar problem here. We replaced a few of our old Intel Xeon nodes with Threadripper 1920 systems with two Titan X GPUs each. P2P transfers in MXNet fail. Syslog shows thousands of entries like this:
I’ve solved the problem and posted the fix in another thread with the same title. The solution is to disable the IOMMU in the BIOS settings. NVIDIA has its own memory management mechanism.
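For anyone who wants to confirm that P2P works after changing the BIOS setting, here is a minimal sketch of a check. It is not taken from the CUDA samples. It assumes the two GPUs are devices 0 and 1, and the CHECK macro is just a hypothetical stand-in for the samples' checkCudaErrors helper. It copies a buffer from GPU 0 to GPU 1 with cudaMemcpyPeer and compares the result against the host data, so a bad transfer shows up as a mismatch count rather than a silent failure.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper, standing in for checkCudaErrors from the CUDA samples.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,          \
                    cudaGetErrorString(err_));                          \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) {
        printf("Need at least two GPUs, found %d\n", count);
        return 0;
    }

    // Assumption: the two cards are devices 0 and 1.
    int can01 = 0, can10 = 0;
    CHECK(cudaDeviceCanAccessPeer(&can01, 0, 1));
    CHECK(cudaDeviceCanAccessPeer(&can10, 1, 0));
    printf("Peer access 0->1: %d, 1->0: %d\n", can01, can10);
    if (!can01 || !can10) {
        printf("P2P is not supported between these devices\n");
        return 0;
    }

    const size_t n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    std::vector<float> src(n), dst(n, 0.0f);
    for (size_t i = 0; i < n; ++i) src[i] = float(i % 4096);

    // Allocate one buffer per GPU and enable peer access in both directions.
    float *g0 = nullptr, *g1 = nullptr;
    CHECK(cudaSetDevice(0));
    CHECK(cudaDeviceEnablePeerAccess(1, 0));
    CHECK(cudaMalloc(&g0, bytes));
    CHECK(cudaMemcpy(g0, src.data(), bytes, cudaMemcpyHostToDevice));

    CHECK(cudaSetDevice(1));
    CHECK(cudaDeviceEnablePeerAccess(0, 0));
    CHECK(cudaMalloc(&g1, bytes));

    // Direct GPU 0 -> GPU 1 copy, then read it back to the host for checking.
    CHECK(cudaMemcpyPeer(g1, 1, g0, 0, bytes));
    CHECK(cudaDeviceSynchronize());
    CHECK(cudaMemcpy(dst.data(), g1, bytes, cudaMemcpyDeviceToHost));

    size_t mismatches = 0;
    for (size_t i = 0; i < n; ++i)
        if (dst[i] != src[i]) ++mismatches;
    printf("Mismatched elements after peer copy: %zu\n", mismatches);

    CHECK(cudaFree(g1));
    CHECK(cudaSetDevice(0));
    CHECK(cudaFree(g0));
    return mismatches == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Build it with something like nvcc -o p2pcheck p2pcheck.cu (the file name is arbitrary). If it hangs or reports mismatches while the IOMMU is still enabled, that would be consistent with the problem described above.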