Hi, I am trying to run the DOCA GPUNetIO sample program doca_gpunetio_rdma_client_server_write
on a ConnectX-7 (CX7) NIC and an H200 GPU.
My configuration is below:
85:00.0 Ethernet controller: Mellanox Technologies MT2910 Family [ConnectX-7]
e3:00.0 3D controller: NVIDIA Corporation GH100 [H200 NVL]
lspci -s e3:00.0 -n
e3:00.0 0302: 10de:233b (rev a1)
$ ethtool -i ens1006np0
driver: mlx5_core
version: 6.15.0
firmware-version: 28.43.2026 (MT_0000000838)
~$ nvidia-smi -q | grep BAR1 -A 3
BAR1 Memory Usage
Total : 262144 MiB
Used : 2 MiB
Free : 262142 MiB
Linux kernel 6.15
Kernel command line: amd_iommu=off hugepagesz=1G hugepages=128
/opt/mellanox/doca/samples/doca_gpunetio/gpunetio_rdma_client_server_write/build$ ./doca_gpunetio_rdma_client_server_write -d rocep133s0 -gpu e3:00.0 -l 70 --sdk-log-level 70
[10:15:25:022512][2714272][DOCA][INF][gpunetio_rdma_client_server_write_main.c:461][main] Starting the sample
[10:15:25:024015][2714272][DOCA][INF][doca_dev.cpp:622][doca_devinfo_create_list] Devinfo list 0x5d601ce82598: Added device=0x5d601ce76cd0 to devinfo list
[10:15:25:024030][2714272][DOCA][INF][doca_dev.cpp:631][doca_devinfo_create_list] Devinfo list 0x5d601ce82598 was created
[10:15:25:030097][2714272][DOCA][INF][linux_devx_adapter.cpp:96][open] devx adapter 0x5d601ce7dba0: opened ibv_ctx 0x5d601ce7dc80
[10:15:25:030711][2714272][DOCA][INF][doca_dev.cpp:993][doca_dev_open] Local device 0x5d601ce76cd0 was opened
[10:15:25:030719][2714272][DOCA][INF][doca_dev.cpp:712][doca_devinfo_destroy_list] Devinfo list 0x5d601ce82598 was destroyed
[10:15:25:758112][2714272][DOCA][DBG][doca_gpunetio.cpp:149][doca_gpu_create] GPU needs flush 0
[10:15:25:758130][2714272][DOCA][INF][doca_sub_dev.cpp:45][priv_doca_sub_dev_gpu_ops_set] sub_dev: gpu_ops was set to 0x700dc7502cc0
[10:15:25:758343][2714272][DOCA][DBG][priv_doca_rdma.cpp:356][priv_doca_rdma_get_max_send_data_segs] devinfo 0x5d601ce76d30: max_send_data_segs=64 for transport_type=RC
[10:15:25:758352][2714272][DOCA][DBG][priv_doca_rdma.cpp:356][priv_doca_rdma_get_max_send_data_segs] devinfo 0x5d601ce76d30: max_send_data_segs=16 for transport_type=DC
[10:15:25:758354][2714272][DOCA][DBG][priv_doca_rdma.cpp:315][priv_doca_rdma_get_max_recv_data_segs] devinfo 0x5d601ce76d30: max_recv_data_segs=32
[10:15:25:758782][2714272][DOCA][INF][doca_rdma.cpp:262][doca_rdma_create] RDMA=0x700dc6ade010 was created, dev=0x5d601ce76cd0
[10:15:25:758788][2714272][DOCA][INF][doca_rdma.cpp:1773][doca_rdma_set_permissions] RDMA 0x700dc6ade010: permissions were set to 0x5
[10:15:25:758790][2714272][DOCA][INF][doca_rdma.cpp:1567][doca_rdma_set_send_queue_size] RDMA 0x700dc6ade010: send_queue_size was set to 8192
[10:15:25:758794][2714272][DOCA][INF][doca_rdma.cpp:1611][doca_rdma_set_recv_queue_size] RDMA 0x700dc6ade010: recv_queue_size was set to 8192
[10:15:25:758798][2714272][DOCA][INF][doca_rdma.cpp:1801][doca_rdma_set_grh_enabled] RDMA 0x700dc6ade010: grh_enabled was set to 1
[10:15:25:758800][2714272][DOCA][INF][doca_ctx.cpp:185][adjust_ctx_ops_to_data_path] CTX 0x700dc6ade010: ctx ops set to GPU ops
[10:15:25:758804][2714272][DOCA][INF][doca_ctx.cpp:245][doca_ctx_start] CTX 0x700dc6ade010 does not require PE
[10:15:25:758994][2714272][DOCA][INF][doca_uar.cpp:233][bridge_init] UAR 0x5d601ceb1e00 created: page=0x700dc6e20000, reg_addr=0x700dc6e20800, base_addr=0x700dc6e20000, id=259, alloc_type=NONCACHE_DEDICATED
[10:15:25:853441][2714272][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 700ba7e00000 GPU 700ba7e00000 CPU 0 type 0 size 1052672
[10:15:25:853493][2714272][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x5d601ce7dba0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[10:15:25:853552][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[10:15:25:853562][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[10:15:25:853572][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[10:15:25:853577][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[10:15:25:853582][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:154][priv_doca_rdma_gpu_create_cq] RDMA 0x700dc6ade010: Failed to create UMEM GPU. err=DOCA_ERROR_DRIVER
[10:15:25:853701][2714272][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 700ba7e00000 GPU 700ba7e00000 CPU 0 type 0 size 1052672
[10:15:25:853709][2714272][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x5d601ce7dba0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[10:15:25:853713][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[10:15:25:853715][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[10:15:25:853718][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[10:15:25:853720][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[10:15:25:853722][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:154][priv_doca_rdma_gpu_create_cq] RDMA 0x700dc6ade010: Failed to create UMEM GPU. err=DOCA_ERROR_DRIVER
[10:15:25:853805][2714272][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 700ba7e00000 GPU 700ba7e00000 CPU 0 type 0 size 135168
[10:15:25:853811][2714272][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x5d601ce7dba0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[10:15:25:853814][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[10:15:25:853816][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[10:15:25:853818][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[10:15:25:853820][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[10:15:25:853822][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:662][priv_doca_rdma_gpu_create_rmp] Failed to create UMEM for RMP. err=DOCA_ERROR_DRIVER
[10:15:25:853884][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:877][priv_doca_rdma_gpu_create_connection_objects] RDMA 0x700dc6ade010: Failed to create RMP. err=DOCA_ERROR_DRIVER
[10:15:25:853892][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:1150][priv_doca_rdma_ctx_gpu_start] RDMA 0x700dc6ade010: Failed to start ctx: Failed to create connection objects. err=DOCA_ERROR_DRIVER
[10:15:25:853943][2714272][DOCA][ERR][doca_ctx.cpp:269][doca_ctx_start] Failed to start context 0x7fffc0c3d618 with status DOCA_ERROR_DRIVER
[10:15:25:853950][2714272][DOCA][ERR][rdma_common.c:428][create_rdma_resources] Failed to start RDMA context: DOCA Driver call failure
[10:15:25:853955][2714272][DOCA][INF][doca_pe.cpp:115][priv_doca_pe_ctx_destroy] Destroying progress engine ctx=0x700dc6ade010
[10:15:25:853958][2714272][DOCA][INF][doca_rdma.cpp:307][doca_rdma_destroy] RDMA 0x700dc6ade010: RDMA was destroyed
[10:15:25:855030][2714272][DOCA][INF][doca_dev.cpp:143][dev_put] Device 0x5d601ce76cd0 was destroyed
[10:15:25:855037][2714272][DOCA][INF][doca_dev.cpp:1008][doca_dev_close] Local device 0x5d601ce76cd0 was closed
[10:15:25:855040][2714272][DOCA][ERR][gpunetio_rdma_client_server_write_sample.c:571][rdma_write_server] Failed to allocate RDMA resources: DOCA Driver call failure
[10:15:25:855044][2714272][DOCA][ERR][gpunetio_rdma_client_server_write_main.c:495][main] rdma_write_server() failed: DOCA Driver call failure
[10:15:25:855051][2714272][DOCA][INF][gpunetio_rdma_client_server_write_main.c:514][main] Sample finished with errors
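The umem_reg lines in the log above print the error as "UNKNOWN-errno14 (14)" without a symbolic name. As a small aid for reading the log, the sketch below decodes that number with the Python standard library; on Linux, errno 14 is EFAULT. It only translates the number and says nothing about why the registration of the GPU memory fails.

```python
import errno
import os

# Value reported by umem_reg in the log ("Got errno=UNKNOWN-errno14 (14)").
code = 14

# Map the numeric errno to its symbolic name and the C library's description.
name = errno.errorcode[code]
message = os.strerror(code)

print(f"errno {code} = {name} ({message})")
```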
My questions:
0. How can I resolve these failures?
1. Which kernel modules and libraries are needed to run this sample?
2. What configuration is required?
3. Do we need nvidia-peermem? This is the latest kernel with CUDA 12.9 (open-source NVIDIA driver),
so I guess dmabuf should work.
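To make question 3 easier to answer, here is a minimal sketch of how the loaded kernel modules could be reported from /proc/modules. The module names in CANDIDATES are my assumption of what might be relevant (they are real module names, but whether each one is actually required for this sample is exactly what I am asking), and the helper names loaded_modules and report are only illustrative.

```python
from pathlib import Path

# Assumption: modules that might matter for GPUDirect RDMA on this setup.
# Whether each one is required is the open question, not a confirmed list.
CANDIDATES = ("nvidia", "nvidia_peermem", "mlx5_core", "mlx5_ib", "ib_uverbs")


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def report(proc_modules_text, candidates=CANDIDATES):
    """Map each candidate module name to True if it is loaded."""
    loaded = loaded_modules(proc_modules_text)
    return {name: name in loaded for name in candidates}


if __name__ == "__main__":
    path = Path("/proc/modules")
    # Fall back to empty text on systems without /proc/modules.
    text = path.read_text() if path.exists() else ""
    for name, present in report(text).items():
        print(f"{name}: {'loaded' if present else 'not loaded'}")
```

A module that is built into the kernel rather than loaded will not appear in /proc/modules, so "not loaded" here does not by itself prove the functionality is missing.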
Note: The DOCA host package is too big, so I have selectively installed the packages below:
doca-perftest_1.0.1_amd64.deb
doca-samples_3.0.0058-1_amd64.deb
libdoca-sdk-comch-dev_3.0.0058-1_amd64.deb
doca-sdk-eth_3.0.0058-1_amd64.deb
libdoca-sdk-gpunetio-dev_3.0.0058-1_amd64.deb
doca-sdk-gpunetio_3.0.0058-1_amd64.deb
mlnx-dpdk-dev_22.11.0-2504.1.0.2504061_amd64.deb
rdma-core_2501mlnx56-1.2504061_amd64.deb
perftest_25.04.0-0.84.g97da83e.2504061_amd64.deb
mlnx-tools_25.01-0.2504061_amd64.deb
mlnx-ofed-kernel-utils_25.04.OFED.25.04.0.6.1.1-1_amd64.deb
mlnx-dpdk_22.11.0-2504.1.0.2504061_amd64.deb
doca-sdk-common_3.0.0058-1_amd64.deb
doca-sdk-rdma_3.0.0058-1_amd64.deb
ibacm_2501mlnx56-1.2504061_amd64.deb
ibutils2_2.1.1-0.22200.MLNX20250423.g91730569c.2504061_amd64.deb
ibverbs-providers_2501mlnx56-1.2504061_amd64.deb
infiniband-diags_2501mlnx56-1.2504061_amd64.deb
libdoca-sdk-common-dev_3.0.0058-1_amd64.deb
libdoca-sdk-argp-dev_3.0.0058-1_amd64.deb
libdoca-sdk-eth-dev_3.0.0058-1_amd64.deb
libdoca-sdk-dpdk-bridge-dev_3.0.0058-1_amd64.deb
libdoca-sdk-flow-dev_3.0.0058-1_amd64.deb
libdoca-sdk-rdma-dev_3.0.0058-1_amd64.deb