gdrcpy problem

xinchen · February 25, 2016, 8:03pm

I built gdrcopy on a compute node with two Titan X GPUs and two CPUs. After installing it, I ran the test programs. The results are as follows:
$ ./validate
buffer size: 327680

$ ./copybw
testing size: 4096
rounded size: 65536
device ptr: 206c80000
closing gdrdrv

But according to NVIDIA's announcement, the results should be:

$ ./validate
buffer size: 327680
check 1: direct access + read back via cuMemcpy D->H
check 2: gdr_copy_to_bar() + read back via cuMemcpy D->H
check 3: gdr_copy_to_bar() + read back via gdr_copy_from_bar()
check 4: gdr_copy_to_bar() + read back via gdr_copy_from_bar() + extra_dwords=5
$ ./copybw
testing size: 4096
rounded size: 65536
device ptr: 5046c0000
bar_ptr: 0x7f8cff410000
info.va: 5046c0000
info.mapped_size: 65536
info.page_size: 65536
page offset: 0
user-space pointer:0x7f8cff410000
BAR writing test…
BAR1 write BW: 9549.25MB/s
BAR reading test…
BAR1 read BW: 1.50172MB/s
unmapping buffer
unpinning buffer
closing gdrdrv

What is the problem?

Robert_Crovella · February 25, 2016, 8:44pm

Your platform may not support GPUDirect RDMA.

For example in validate.cpp you may be hitting this assert/break:

BEGIN_CHECK {
        // tokens are optional in CUDA 6.0
        // wave out the test if GPUDirectRDMA is not enabled
        BREAK_IF_NEQ(gdr_pin_buffer(g, d_A, size, 0, 0, &mh), 0);

If that check fails, execution breaks out of the “BEGIN_CHECK” (do-) loop without printing any of the subsequent messages.
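To illustrate why a failed pin produces no further output, here is a minimal sketch of the do/break pattern. The macro definitions are simplified stand-ins written for this example (assumed shape, not copied from the gdrcopy sources), and `fake_pin_buffer` and `run_validate` are hypothetical names that replace the real `gdr_pin_buffer` call and the test body.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for the macros used in validate.cpp.
// Assumption: the real macros expand to a do { ... } while (0) block
// with a conditional break; the exact definitions may differ.
#define BEGIN_CHECK do
#define END_CHECK while (0)
#define BREAK_IF_NEQ(expr, expected) if ((expr) != (expected)) break

// Hypothetical replacement for gdr_pin_buffer(): returns 0 on success
// and a non-zero value when the buffer cannot be pinned.
static int fake_pin_buffer(bool rdma_supported) {
    return rdma_supported ? 0 : 1;
}

// Collects the messages the test would print on a given platform.
std::vector<std::string> run_validate(bool rdma_supported) {
    std::vector<std::string> log;
    log.push_back("buffer size: 327680");
    BEGIN_CHECK {
        // A non-zero result leaves the block here, silently.
        BREAK_IF_NEQ(fake_pin_buffer(rdma_supported), 0);
        log.push_back("check 1");
        log.push_back("check 2");
    } END_CHECK;
    return log;
}
```

Calling `run_validate(false)` yields only the buffer-size line, which matches the truncated output in the question, while `run_validate(true)` also yields the check lines.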

Some of the requirements are listed here:

https://github.com/NVIDIA/gdrcopy

Note in particular this statement there:

“GPUDirect RDMA requires an NVIDIA Tesla and Quadro class GPUs based on Kepler/Maxwell, see GPUDirect RDMA.”

You have GeForce Titan X GPUs, which are neither Tesla nor Quadro class.

Even if you meet all of those requirements, your platform may still not support GPUDirect RDMA. That depends on the platform and its topology, which you haven’t described.
