What are the proper parameters to configure a rebuild of the HPC-X ompi for GPUDirect and RDMA?


What are the proper configure parameters for rebuilding the HPC-X ompi so that GPUDirect and RDMA support is included in the library?

In the document “https://www.open-mpi.org/faq/?category=runcuda”, under “GPUDirect RDMA Information”, there are commands given to see if you have GPUDirect RDMA compiled into your library:

$ ompi_info --all | grep btl_openib_have_cuda_gdr

$ ompi_info --all | grep btl_openib_have_driver_gdr

Neither of those results in “true” (the flags don’t even appear to be present).
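
For reference, a minimal sketch that checks both parameters and says so explicitly when a name is absent, rather than leaving empty grep output to interpret. The helper name `report_mca_param` and the file path in the usage comment are hypothetical; the only assumption is that the saved `ompi_info --all` output contains the parameter name on the line that describes it.

```shell
# Hypothetical helper: report whether an MCA parameter appears in saved
# ompi_info output, printing the matching line(s) when it does.
#   $1 = file holding the output of `ompi_info --all`
#   $2 = parameter name to look for
report_mca_param() {
    # -w matches the whole name only; -F treats it as a fixed string.
    if grep -w -F -- "$2" "$1"; then
        return 0
    fi
    echo "$2: not listed by ompi_info"
    return 1
}

# Usage (assumes the ompi_info from the HPC-X build is first in PATH):
#   ompi_info --all > /tmp/ompi_info_all.txt
#   report_mca_param /tmp/ompi_info_all.txt btl_openib_have_cuda_gdr
#   report_mca_param /tmp/ompi_info_all.txt btl_openib_have_driver_gdr
```

A "not listed" result only shows that this build does not expose the parameter; it does not say why.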

In the document “http://www.mellanox.com/related-docs/prod_software/Mellanox_GPUDirect_User_Manual.pdf”, you give a configuration string for recompiling openmpi for running GPUDirect RDMA:

./configure --prefix=/path/to/openmpi-1.10.0_cuda7.0 --with-wrapper-ldflags=-Wl,-rpath,/lib --disable-vt --enable-orterun-prefix-by-default --disable-io-romio --enable-picky --with-cuda=/usr/local/cuda-7.0

It appears that “--with-cuda” is the key option there, even though the versions are out of date. That string is also written against a trunk build of openmpi.

I have gpudirect and gdrcopy both properly installed.

In both the default HPC-X installation and in my build the config.status file shows “mpi_build_with_cuda_support” as true.

Is this just a mix of out of date information?

Does that flag (“mpi_build_with_cuda_support”) correlate to GPUDirect and RDMA being correctly configured into the ompi build?

How/where does one verify that?

The HPC-X docs discuss it, but they end up giving only generic guidance:



Hello Andrew,

Many thanks for posting your question on the Mellanox Community. As you have a valid support contract, you can also open a Mellanox Support ticket by sending an email to supportadmin@mellanox.com.

For now, I opened a new ticket for you and one of our engineers will assist you shortly with your request.

Many thanks,

~Mellanox Technical Support

In the (latest) Mellanox_GPUDirect_User_Manual (1.6):

% mpirun -mca btl_openib_want_cuda_gdr 1 -np 2 -npernode 1 -mca btl_openib_if_include mlx5_0:1 -bind-to-core -cpu-set 19 -x CUDA_VISIBLE_DEVICES=0 /path/to/osu-benchmarks/osu_latency -d cuda D D

So the flags:



appear to be set at runtime, though I don’t know how to tell whether they are actually being used. The OpenMPI docs and the Mellanox docs are a little different. The benchmark appears to accept either one, but it gives no indication of what it is really using. It accepts garbage without even a warning.
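
Since the run accepts garbage without a warning, one partial safeguard is to compare the `-mca` names on the command line against what `ompi_info` lists before launching. The following is a sketch only: the helper name `unknown_mca_params` and the file path are hypothetical, and a name being listed shows only that the build recognizes it, not that GPUDirect RDMA is used for any given transfer.

```shell
# Hypothetical helper: print every "-mca NAME VALUE" name on an mpirun
# command line that does not appear in saved ompi_info output.
#   $1 = file holding the output of `ompi_info --all`
#   remaining arguments = the mpirun arguments to inspect
unknown_mca_params() {
    known=$1
    shift
    while [ "$#" -gt 0 ]; do
        case $1 in
            -mca|--mca)
                # The word following -mca is the parameter name.
                if [ "$#" -ge 2 ] && ! grep -q -w -F -- "$2" "$known"; then
                    echo "unknown MCA parameter: $2"
                fi
                ;;
        esac
        shift
    done
}

# Usage (assumes the ompi_info from the HPC-X build is first in PATH):
#   ompi_info --all > /tmp/ompi_info_all.txt
#   unknown_mca_params /tmp/ompi_info_all.txt \
#       -mca btl_openib_want_cuda_gdr 1 -np 2 -npernode 1
```

No output means every `-mca` name was found in the saved listing.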