gdrcopy problem

I built gdrcopy on a compute node with two Titan X GPUs and two CPUs. After installing it, I ran the test programs. The results are as follows:
$ ./validate
buffer size: 327680

$ ./copybw
testing size: 4096
rounded size: 65536
device ptr: 206c80000
closing gdrdrv

But according to NVIDIA's announcement, the results should be:

$ ./validate
buffer size: 327680
check 1: direct access + read back via cuMemcpy D->H
check 2: gdr_copy_to_bar() + read back via cuMemcpy D->H
check 3: gdr_copy_to_bar() + read back via gdr_copy_from_bar()
check 4: gdr_copy_to_bar() + read back via gdr_copy_from_bar() + extra_dwords=5

$ ./copybw
testing size: 4096
rounded size: 65536
device ptr: 5046c0000
bar_ptr: 0x7f8cff410000
info.va: 5046c0000
info.mapped_size: 65536
info.page_size: 65536
page offset: 0
user-space pointer:0x7f8cff410000
BAR writing test…
BAR1 write BW: 9549.25MB/s
BAR reading test…
BAR1 read BW: 1.50172MB/s
unmapping buffer
unpinning buffer
closing gdrdrv

What is the problem?

Your platform may not support GPUDirect RDMA.

For example in validate.cpp you may be hitting this assert/break:

BEGIN_CHECK {
        // tokens are optional in CUDA 6.0
        // wave out the test if GPUDirectRDMA is not enabled
        BREAK_IF_NEQ(gdr_pin_buffer(g, d_A, size, 0, 0, &mh), 0);

If gdr_pin_buffer() returns nonzero there, execution breaks out of the “BEGIN_CHECK” (do-) loop without printing any of the subsequent messages, which matches the truncated output you are seeing.
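
To make that control flow concrete, here is a minimal sketch of the pattern. The macro definitions are simplified stand-ins (their exact form in gdrcopy is an assumption on my part), and fake_pin_buffer() and run_validate() are hypothetical names invented for illustration; the stub simply returns nonzero where the real gdr_pin_buffer() would fail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for the test-harness macros; their exact
// definitions in gdrcopy are an assumption here.
#define BEGIN_CHECK do
#define END_CHECK while (0)
#define BREAK_IF_NEQ(P, V) if ((P) != (V)) break

// Hypothetical stub in place of gdr_pin_buffer():
// returns 0 on success, nonzero on failure.
static int fake_pin_buffer(bool rdma_supported) {
    return rdma_supported ? 0 : -1;
}

// Collects the lines a validate-like run would print.
static std::vector<std::string> run_validate(bool rdma_supported) {
    std::vector<std::string> out;
    out.push_back("buffer size: 327680");  // emitted before the guarded block

    BEGIN_CHECK {
        // If pinning fails, leave the block; nothing below this line runs.
        BREAK_IF_NEQ(fake_pin_buffer(rdma_supported), 0);
        out.push_back("check 1: direct access + read back via cuMemcpy D->H");
        out.push_back("check 2: gdr_copy_to_bar() + read back via cuMemcpy D->H");
    } END_CHECK;

    assert(!out.empty());  // the first line is always recorded
    return out;
}
```

Calling run_validate(false) from a main() yields only the "buffer size" line, as in your output, while run_validate(true) also yields the check lines.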

Some of the requirements are listed here:

https://github.com/NVIDIA/gdrcopy

Note in particular this statement there:

“GPUDirect RDMA requires an NVIDIA Tesla and Quadro class GPUs based on Kepler/Maxwell, see GPUDirect RDMA.”

You have GeForce Titan X GPUs.

Even if you meet all those requirements, your platform may still not support GPUDirect RDMA. That depends on the platform and its topology, which you haven’t described.