Mellanox InfiniBand + mmap() + MPI one-sided communication fails when DAPL UD is enabled

Hi!

I used a trick to read a page located on a remote machine’s disk.

(by mmap()-ing the whole file on each machine and creating MPI one-sided communication windows over it)
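For reference, a minimal sketch of this approach (not the actual attached test2 code; the file name "data.bin" and the 4 KiB page pulled from offset 0 are placeholders I made up) looks roughly like this:

/* Illustrative sketch only -- not the attached test2 source.
 * File name, page size, and remote offset are placeholders. */
#include <mpi.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank mmap()s its local copy of the file in its entirety. */
    int fd = open("data.bin", O_RDONLY);               /* placeholder file name */
    if (fd < 0) { perror("open"); MPI_Abort(MPI_COMM_WORLD, 1); }
    struct stat st;
    fstat(fd, &st);
    void *base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); MPI_Abort(MPI_COMM_WORLD, 1); }

    /* Expose the whole mapped region through a one-sided window. */
    MPI_Win win;
    MPI_Win_create(base, st.st_size, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Rank 0 pulls one 4 KiB page out of rank 1's mapped file. */
    if (rank == 0) {
        char page[4096];
        MPI_Win_lock(MPI_LOCK_SHARED, 1, 0, win);
        MPI_Get(page, sizeof page, MPI_BYTE,
                1, 0 /* remote offset, placeholder */,
                sizeof page, MPI_BYTE, win);
        MPI_Win_unlock(1, win);
        printf("first byte of remote page: %d\n", page[0]);
    }

    MPI_Win_free(&win);
    munmap(base, st.st_size);
    close(fd);
    MPI_Finalize();
    return 0;
}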

It works fine when DAPL UD is disabled, but it prints the following error messages if I enable DAPL UD by setting ‘I_MPI_DAPL_UD=1’.

XXX001:UCM:1d1a:84d2ab40: 271380 us(271380 us): DAPL ERR reg_mr Cannot allocate memory

[0:XXX001] rtc_register failed 196608 [0] error(0x30000): unknown error

Assertion failed in file …/…/src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0

internal ABORT - process 0

XXX002:UCM:31e2:27bacb40: 263683 us(263683 us): DAPL ERR reg_mr Cannot allocate memory

[1:XXX002] rtc_register failed 196608 [1] error(0x30000): unknown error

Assertion failed in file …/…/src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_send_ud.c at line 1468: 0

Please refer to the attached file for the code I used.

I ran the above program with the following environment variables set:

export I_MPI_FABRICS=dapl

export I_MPI_DAPL_UD=1

command: mpiexec.hydra -genvall -machinefile ~/machines -n 2 -ppn 1 ${PWD}/test2

Here are my general questions:

(1) When the window over the mmap()ed region is created, does the IB driver try to pin the whole memory region to prevent page faults?

(2) Does the IB driver’s memory-registration behavior differ depending on whether DAPL UD is enabled or disabled?
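For what it’s worth, the registration step can be reproduced outside MPI/DAPL with a plain libibverbs sketch like the one below (the file name is a placeholder and it assumes a single HCA is present). ibv_reg_mr() is the call that pins the mapped pages, and an ENOMEM failure there, like the "reg_mr Cannot allocate memory" above, usually points at the locked-memory limit rather than actual memory exhaustion:

/* Standalone verbs-level sketch (no MPI/DAPL). Registering an mmap()ed
 * region with ibv_reg_mr() pins its pages; it fails with ENOMEM when the
 * locked-memory limit (ulimit -l) is too small. File name is a
 * placeholder. Link with -libverbs. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <infiniband/verbs.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);               /* placeholder */
    if (fd < 0) { perror("open"); return 1; }
    struct stat st;
    fstat(fd, &st);
    void *base = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    struct ibv_device **devs = ibv_get_device_list(NULL);
    if (!devs || !devs[0]) { fprintf(stderr, "no IB device found\n"); return 1; }
    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_pd *pd = ibv_alloc_pd(ctx);

    /* This is the step that pins the whole mapped region. */
    struct ibv_mr *mr = ibv_reg_mr(pd, base, st.st_size, IBV_ACCESS_REMOTE_READ);
    if (!mr)
        perror("ibv_reg_mr");
    else
        printf("registered %lld bytes, lkey=0x%x\n",
               (long long)st.st_size, mr->lkey);

    if (mr) ibv_dereg_mr(mr);
    ibv_dealloc_pd(pd);
    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    munmap(base, st.st_size);
    close(fd);
    return 0;
}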

Experimental Environment:

Hardware Spec:

OS : CentOS 6.4 Final

CPU : 2 × Intel® Xeon® E5-2450 @ 2.10 GHz (8 physical cores each)

RAM : 32 GB per node

Interconnect : Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]

Mellanox Infiniband driver: MLNX_OFED_LINUX-3.1-1.1.0.1 (OFED-3.1-1.1.0): 3.19.0

thanks,

MOFED does ship an Open MPI that uses UD QPs by default, or you can try the HPC-X Toolkit from the Mellanox site, which is available for different OSes.

Check ulimits (Environment Problems | Intel® Developer Zone https://software.intel.com/en-us/node/561768 ) and maybe limits.conf
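A quick way to see the locked-memory limit the MPI ranks actually inherit is a tiny getrlimit() check like the diagnostic sketch below (launch it through mpiexec.hydra the same way as the test program, since the limit can differ between an interactive shell and remotely spawned processes):

/* Diagnostic sketch: print the RLIMIT_MEMLOCK limit seen by this process. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RLIMIT_MEMLOCK: unlimited\n");
    else
        printf("RLIMIT_MEMLOCK: %llu bytes\n",
               (unsigned long long)rl.rlim_cur);
    return 0;
}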

Not sure if this is any help:

Have you tried to file it as a bug to the Intel MPI team?

Does the same thing happen if you use another MPI implementation?