Run DOCA GPUNETIO On CX7 and H200

Hi, I am trying to run the DOCA GPUNetIO sample program doca_gpunetio_rdma_client_server_write
on a ConnectX-7 (CX7) and an H200.

Below is my configuration.

85:00.0 Ethernet controller: Mellanox Technologies MT2910 Family [ConnectX-7]
e3:00.0 3D controller: NVIDIA Corporation GH100 [H200 NVL]

lspci -s e3:00.0 -n
e3:00.0 0302: 10de:233b (rev a1)

$ ethtool -i ens1006np0
driver: mlx5_core
version: 6.15.0
firmware-version: 28.43.2026 (MT_0000000838)

~$ nvidia-smi -q | grep BAR1 -A 3
BAR1 Memory Usage
Total : 262144 MiB
Used : 2 MiB
Free : 262142 MiB

Linux kernel 6.15
Kernel command line: amd_iommu=off hugepagesz=1G hugepages=128

/opt/mellanox/doca/samples/doca_gpunetio/gpunetio_rdma_client_server_write/build$ ./doca_gpunetio_rdma_client_server_write -d rocep133s0 -gpu e3:00.0  -l 70 --sdk-log-level 70
[10:15:25:022512][2714272][DOCA][INF][gpunetio_rdma_client_server_write_main.c:461][main] Starting the sample
[10:15:25:024015][2714272][DOCA][INF][doca_dev.cpp:622][doca_devinfo_create_list] Devinfo list 0x5d601ce82598: Added device=0x5d601ce76cd0 to devinfo list
[10:15:25:024030][2714272][DOCA][INF][doca_dev.cpp:631][doca_devinfo_create_list] Devinfo list 0x5d601ce82598 was created
[10:15:25:030097][2714272][DOCA][INF][linux_devx_adapter.cpp:96][open] devx adapter 0x5d601ce7dba0: opened ibv_ctx 0x5d601ce7dc80
[10:15:25:030711][2714272][DOCA][INF][doca_dev.cpp:993][doca_dev_open] Local device 0x5d601ce76cd0 was opened
[10:15:25:030719][2714272][DOCA][INF][doca_dev.cpp:712][doca_devinfo_destroy_list] Devinfo list 0x5d601ce82598 was destroyed
[10:15:25:758112][2714272][DOCA][DBG][doca_gpunetio.cpp:149][doca_gpu_create] GPU needs flush 0
[10:15:25:758130][2714272][DOCA][INF][doca_sub_dev.cpp:45][priv_doca_sub_dev_gpu_ops_set] sub_dev: gpu_ops was set to 0x700dc7502cc0
[10:15:25:758343][2714272][DOCA][DBG][priv_doca_rdma.cpp:356][priv_doca_rdma_get_max_send_data_segs] devinfo 0x5d601ce76d30: max_send_data_segs=64 for transport_type=RC
[10:15:25:758352][2714272][DOCA][DBG][priv_doca_rdma.cpp:356][priv_doca_rdma_get_max_send_data_segs] devinfo 0x5d601ce76d30: max_send_data_segs=16 for transport_type=DC
[10:15:25:758354][2714272][DOCA][DBG][priv_doca_rdma.cpp:315][priv_doca_rdma_get_max_recv_data_segs] devinfo 0x5d601ce76d30: max_recv_data_segs=32
[10:15:25:758782][2714272][DOCA][INF][doca_rdma.cpp:262][doca_rdma_create] RDMA=0x700dc6ade010 was created, dev=0x5d601ce76cd0
[10:15:25:758788][2714272][DOCA][INF][doca_rdma.cpp:1773][doca_rdma_set_permissions] RDMA 0x700dc6ade010: permissions were set to 0x5
[10:15:25:758790][2714272][DOCA][INF][doca_rdma.cpp:1567][doca_rdma_set_send_queue_size] RDMA 0x700dc6ade010: send_queue_size was set to 8192
[10:15:25:758794][2714272][DOCA][INF][doca_rdma.cpp:1611][doca_rdma_set_recv_queue_size] RDMA 0x700dc6ade010: recv_queue_size was set to 8192
[10:15:25:758798][2714272][DOCA][INF][doca_rdma.cpp:1801][doca_rdma_set_grh_enabled] RDMA 0x700dc6ade010: grh_enabled was set to 1
[10:15:25:758800][2714272][DOCA][INF][doca_ctx.cpp:185][adjust_ctx_ops_to_data_path] CTX 0x700dc6ade010: ctx ops set to GPU ops
[10:15:25:758804][2714272][DOCA][INF][doca_ctx.cpp:245][doca_ctx_start] CTX 0x700dc6ade010 does not require PE
[10:15:25:758994][2714272][DOCA][INF][doca_uar.cpp:233][bridge_init] UAR 0x5d601ceb1e00 created: page=0x700dc6e20000, reg_addr=0x700dc6e20800, base_addr=0x700dc6e20000, id=259, alloc_type=NONCACHE_DEDICATED
[10:15:25:853441][2714272][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 700ba7e00000 GPU 700ba7e00000 CPU 0 type 0 size 1052672

[10:15:25:853493][2714272][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x5d601ce7dba0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[10:15:25:853552][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[10:15:25:853562][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[10:15:25:853572][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[10:15:25:853577][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[10:15:25:853582][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:154][priv_doca_rdma_gpu_create_cq] RDMA 0x700dc6ade010: Failed to create UMEM GPU. err=DOCA_ERROR_DRIVER
[10:15:25:853701][2714272][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 700ba7e00000 GPU 700ba7e00000 CPU 0 type 0 size 1052672

[10:15:25:853709][2714272][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x5d601ce7dba0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[10:15:25:853713][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[10:15:25:853715][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[10:15:25:853718][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[10:15:25:853720][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[10:15:25:853722][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:154][priv_doca_rdma_gpu_create_cq] RDMA 0x700dc6ade010: Failed to create UMEM GPU. err=DOCA_ERROR_DRIVER
[10:15:25:853805][2714272][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 700ba7e00000 GPU 700ba7e00000 CPU 0 type 0 size 135168

[10:15:25:853811][2714272][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x5d601ce7dba0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[10:15:25:853814][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[10:15:25:853816][2714272][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[10:15:25:853818][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[10:15:25:853820][2714272][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[10:15:25:853822][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:662][priv_doca_rdma_gpu_create_rmp] Failed to create UMEM for RMP. err=DOCA_ERROR_DRIVER
[10:15:25:853884][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:877][priv_doca_rdma_gpu_create_connection_objects] RDMA 0x700dc6ade010: Failed to create RMP. err=DOCA_ERROR_DRIVER
[10:15:25:853892][2714272][DOCA][ERR][rdma_ctx_gpu.cpp:1150][priv_doca_rdma_ctx_gpu_start] RDMA 0x700dc6ade010: Failed to start ctx: Failed to create connection objects. err=DOCA_ERROR_DRIVER
[10:15:25:853943][2714272][DOCA][ERR][doca_ctx.cpp:269][doca_ctx_start] Failed to start context 0x7fffc0c3d618 with status DOCA_ERROR_DRIVER
[10:15:25:853950][2714272][DOCA][ERR][rdma_common.c:428][create_rdma_resources] Failed to start RDMA context: DOCA Driver call failure
[10:15:25:853955][2714272][DOCA][INF][doca_pe.cpp:115][priv_doca_pe_ctx_destroy] Destroying progress engine ctx=0x700dc6ade010
[10:15:25:853958][2714272][DOCA][INF][doca_rdma.cpp:307][doca_rdma_destroy] RDMA 0x700dc6ade010: RDMA was destroyed
[10:15:25:855030][2714272][DOCA][INF][doca_dev.cpp:143][dev_put] Device 0x5d601ce76cd0 was destroyed
[10:15:25:855037][2714272][DOCA][INF][doca_dev.cpp:1008][doca_dev_close] Local device 0x5d601ce76cd0 was closed
[10:15:25:855040][2714272][DOCA][ERR][gpunetio_rdma_client_server_write_sample.c:571][rdma_write_server] Failed to allocate RDMA resources: DOCA Driver call failure
[10:15:25:855044][2714272][DOCA][ERR][gpunetio_rdma_client_server_write_main.c:495][main] rdma_write_server() failed: DOCA Driver call failure
[10:15:25:855051][2714272][DOCA][INF][gpunetio_rdma_client_server_write_main.c:514][main] Sample finished with errors
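
For reference, the log prints the failure as "UNKNOWN-errno14 (14)". On Linux, errno 14 is EFAULT ("Bad address"). Below is a minimal sketch for pulling the errno numbers out of a DOCA log and naming the common ones. The function name decode_doca_errnos is made up for illustration, and the table only covers a few well-known Linux values. That EFAULT here points at the kernel being unable to pin the GPU buffer is my assumption, not something the log states.

```shell
# Print each distinct errno number found in a DOCA log read from stdin,
# followed by its symbolic Linux name for a few well-known values.
# decode_doca_errnos is a hypothetical helper, not part of DOCA.
decode_doca_errnos() {
    grep -o 'errno=[^ ]* ([0-9]*)' \
        | sed 's/.*(\([0-9]*\))/\1/' \
        | sort -un \
        | while read -r num; do
            case "$num" in
                1)  name=EPERM ;;
                12) name=ENOMEM ;;
                14) name=EFAULT ;;
                22) name=EINVAL ;;
                *)  name=unknown ;;
            esac
            echo "$num $name"
        done
}
```

Usage: save the sample output to a file and run decode_doca_errnos < sample.log.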

My questions:
0. How can I resolve these failures?
1. Which kernel modules and libraries are needed to run this sample?
2. What configuration is required?
3. Do we need nvidia-peermem? I am on the latest kernel with CUDA 12.9 (open-source NVIDIA driver),
so I guess dmabuf should work.

Note: The DOCA host package is too big, so I have selectively installed the packages below.

doca-perftest_1.0.1_amd64.deb
doca-samples_3.0.0058-1_amd64.deb
libdoca-sdk-comch-dev_3.0.0058-1_amd64.deb
doca-sdk-eth_3.0.0058-1_amd64.deb
libdoca-sdk-gpunetio-dev_3.0.0058-1_amd64.deb
doca-sdk-gpunetio_3.0.0058-1_amd64.deb
mlnx-dpdk-dev_22.11.0-2504.1.0.2504061_amd64.deb
rdma-core_2501mlnx56-1.2504061_amd64.deb
perftest_25.04.0-0.84.g97da83e.2504061_amd64.deb
mlnx-tools_25.01-0.2504061_amd64.deb
mlnx-ofed-kernel-utils_25.04.OFED.25.04.0.6.1.1-1_amd64.deb
mlnx-dpdk_22.11.0-2504.1.0.2504061_amd64.deb
doca-sdk-common_3.0.0058-1_amd64.deb
doca-sdk-rdma_3.0.0058-1_amd64.deb
ibacm_2501mlnx56-1.2504061_amd64.deb
ibutils2_2.1.1-0.22200.MLNX20250423.g91730569c.2504061_amd64.deb
ibverbs-providers_2501mlnx56-1.2504061_amd64.deb
infiniband-diags_2501mlnx56-1.2504061_amd64.deb
libdoca-sdk-common-dev_3.0.0058-1_amd64.deb
libdoca-sdk-argp-dev_3.0.0058-1_amd64.deb
libdoca-sdk-eth-dev_3.0.0058-1_amd64.deb
libdoca-sdk-dpdk-bridge-dev_3.0.0058-1_amd64.deb
libdoca-sdk-flow-dev_3.0.0058-1_amd64.deb
libdoca-sdk-rdma-dev_3.0.0058-1_amd64.deb

Any update on this query?

Did you thoroughly review the DOCA GPUNetIO page on NVIDIA Docs for the overview and configuration? The relevant sections are:

System Configuration

DOCA GPUNetIO components

Validate GPUDirect RDMA is enabled

nvidia-smi topo -m

(Ensure the NIC ↔ GPU link is marked as PHB or PIX, not SYS)
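
If the labels in the matrix are unfamiliar, the small lookup below may help. It is only a sketch: describe_topo_link is a hypothetical helper, and the descriptions paraphrase the legend that nvidia-smi prints under the matrix.

```shell
# Describe a link label from the `nvidia-smi topo -m` matrix.
# describe_topo_link is a hypothetical helper; the text paraphrases
# the legend nvidia-smi prints below the matrix.
describe_topo_link() {
    case "$1" in
        X)    echo "self" ;;
        PIX)  echo "at most one PCIe bridge" ;;
        PXB)  echo "multiple PCIe bridges, no host bridge" ;;
        PHB)  echo "through a PCIe host bridge (typically the CPU)" ;;
        NODE) echo "across host bridges within one NUMA node" ;;
        SYS)  echo "across the inter-socket (SMP) interconnect" ;;
        NV*)  echo "NVLink" ;;
        *)    echo "unknown label" ;;
    esac
}
```

For example, describe_topo_link PIX prints "at most one PCIe bridge", which is the closest NIC to GPU placement the matrix can report.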

Check if GPUDirect RDMA kernel modules are loaded

lsmod | grep -E 'nvidia_peermem|nv_peer_mem'

(If you’re using upstream drivers (not MLNX_OFED), nv_peer_mem might not be required, but ensure that your stack supports GPUDirect RDMA.)

The recommendation is to use the full DOCA installation (select the respective profile), not partial packages. Tests are performed with the full installation of the DOCA driver.
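
The module check above can be scripted roughly as follows. This is a sketch under assumptions: peermem_module is a hypothetical helper, and it looks for both the module that ships with current NVIDIA drivers (nvidia_peermem) and the older standalone one (nv_peer_mem). A result of "none" is not necessarily an error if your stack uses the dmabuf path instead.

```shell
# Report which GPUDirect RDMA peer-memory module, if any, appears in a
# /proc/modules-style listing. The optional path argument exists only to
# make the function testable; it defaults to the live /proc/modules.
# peermem_module is a hypothetical helper name.
peermem_module() {
    modules_file="${1:-/proc/modules}"
    # nvidia_peermem ships with the NVIDIA driver; nv_peer_mem is the
    # older standalone module.
    found=$(awk '$1 == "nvidia_peermem" || $1 == "nv_peer_mem" { print $1 }' "$modules_file")
    if [ -n "$found" ]; then
        echo "$found"
    else
        echo "none"
    fi
}
```

Calling peermem_module with no argument inspects the running system.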

Verify your DOCA and GPU driver stack are compatible.
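
As a starting point for that comparison, something like the sketch below gathers the versions in one place. report_stack_versions is a hypothetical helper; it only collects version strings and does not decide compatibility, which still has to be checked against the release notes for your DOCA version. Tools that are not installed are reported rather than treated as fatal.

```shell
# Print the versions that usually matter when checking a DOCA / GPU stack.
# report_stack_versions is a hypothetical helper. Each tool is probed
# first, so a missing tool is reported instead of aborting the function.
report_stack_versions() {
    echo "kernel: $(uname -r)"
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "gpu driver: $(nvidia-smi --query-gpu=driver_version --format=csv,noheader | sed -n 1p)"
    else
        echo "gpu driver: nvidia-smi not found"
    fi
    if command -v ofed_info >/dev/null 2>&1; then
        echo "ofed: $(ofed_info -s)"
    else
        echo "ofed: ofed_info not found"
    fi
    if command -v dpkg-query >/dev/null 2>&1; then
        # List installed DOCA packages; no match is not an error here.
        dpkg-query -W -f='${Package} ${Version}\n' 'doca*' 'libdoca*' 2>/dev/null
    fi
    return 0
}
```

Posting the output of report_stack_versions along with the failing log makes it easier for others to spot a mismatch.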

@spruitt
1. How do I validate that GPUDirect RDMA is enabled? I guess it is a default feature of CX7 (utilizing nv_peer_mem, or dmabuf with upstream drivers), and the NVIDIA open-source GPU driver (575) supports dmabuf.
2. The NIC and GPU are on the same PCIe switch (PIX).
3. Can you confirm that DOCA OFED is not required for GDA KI? I want to use GDA KI, so which minimal profile should I use: DOCA RoCE, DOCA OFED, or DOCA Networking?
4. I am using a custom kernel (6.14). How can we use DOCA in that case?
5. Regarding "Verify your DOCA and GPU driver stack":
How do I confirm that?