DOCA: GPU Packet Processing - Failed to start mmap DOCA Driver call failure

Hello,

I am trying to run the DOCA GPU Packet Processing sample application, but doca_mmap fails to start (full log below). Our machine has a BlueField-2 NIC and an RTX A2000 GPU.

Someone reported the same issue in this post: "Doca_gpu_packet_processing Failed to start mmap DOCA Driver call failure", but the solution there involved resizable BAR, which is not an option for us. Is anyone aware of an alternative?

Thank you for your help,

Here is the full error log:

sudo ./doca_gpu_packet_processing -n ca:00.0 -g b1:00.0 -q 2
[16:37:17:850011][200362][DOCA][INF][gpu_packet_processing.c:279][main] ===========================================================
[16:37:17:850084][200362][DOCA][INF][gpu_packet_processing.c:280][main] DOCA version: 2.6.0058
[16:37:17:850090][200362][DOCA][INF][gpu_packet_processing.c:281][main] ===========================================================
[16:37:17:850110][200362][DOCA][INF][gpu_packet_processing.c:302][main] Options enabled:
GPU b1:00.0
NIC ca:00.0
GPU Rx queues 2
GPU HTTP server enabled No
EAL: Detected CPU lcores: 24
EAL: Detected NUMA nodes: 2
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
EAL: Probe PCI driver: mlx5_pci (15b3:a2d6) device: 0000:ca:00.0 (socket 1)
EAL: Probe PCI driver: gpu_cuda (10de:2531) device: 0000:b1:00.0 (socket 1)
[16:37:18:581105][200362][DOCA][WRN][engine_model.c:74][adapt_queue_depth] adapting queue depth to 128.
mlx5_net: Unable to create HWS send to kernel action
mlx5_net: [mlx5dr_cmd_header_modify_pattern_create]: Failed to create header_modify_pattern (syndrome: 0x6c1bf)
mlx5_net: [mlx5dr_pat_get_pattern]: Failed to create pattern FW object
mlx5_net: [mlx5dr_action_create_modify_header_hws]: Failed to allocate pattern for modify header
mlx5_net: [mlx5dr_action_create_nat64_copy_state]: Failed to create copy for NAT64: action_sz: 56, flags: 0x48

mlx5_net: [mlx5dr_action_create_nat64]: Nat64 failed creating copy state
mlx5_net: Cannot create NAT64 action on port 0, please check the FW version
[16:37:19:604005][200362][DOCA][INF][udp_queues.c:40][create_udp_queues] Creating UDP Eth Rxq 0
[16:37:19:605919][200362][DOCA][INF][udp_queues.c:85][create_udp_queues] Mapping receive queue buffer (0x0x7f01de000000 size 536870912B) with nvidia-peermem mode
[16:37:19:606139][200362][DOCA][ERR][linux_mapped_user_memory.cpp:75][linux_mapped_user_memory] Failed to register user memory. Got errno: Bad address
[16:37:19:606335][200362][DOCA][ERR][doca_mmap.cpp:204][priv_doca_mmap_dev_to_mkey_init_mkey] Failed to initialize mkey: failed to create memory region with exception:
[16:37:19:606379][200362][DOCA][ERR][doca_mmap.cpp:204][priv_doca_mmap_dev_to_mkey_init_mkey] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[16:37:19:606421][200362][DOCA][ERR][doca_mmap.cpp:356][priv_doca_mmap_init_dev_to_mkey] Mmap 0x5586043e8380: Failed to initialize device=0x55860407abe0. err=DOCA_ERROR_DRIVER
[16:37:19:606447][200362][DOCA][ERR][doca_mmap.cpp:393][priv_doca_mmap_init_dev_to_mkeys] Mmap 0x5586043e8380: Failed to initialize memory range. Failed to register MR for device with id: 1. err=DOCA_ERROR_DRIVER
[16:37:19:606470][200362][DOCA][ERR][udp_queues.c:116][create_udp_queues] Failed to start mmap DOCA Driver call failure
[16:37:19:606497][200362][DOCA][INF][udp_queues.c:231][destroy_udp_queues] Destroying UDP queue 0
[16:37:19:606512][200362][DOCA][ERR][doca_ctx.cpp:235][doca_ctx_stop] Unable to stop context: Received invalid input - ctx is NULL
[16:37:19:606536][200362][DOCA][ERR][udp_queues.c:249][destroy_udp_queues] Failed doca_ctx_stop: Invalid input
[16:37:19:606561][200362][DOCA][ERR][gpu_packet_processing.c:344][main] Function create_udp_queues returned Bad State
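
For reference, the receive buffer in the log above is 536870912 B, which is 512 MiB, and the linked thread pointed at BAR size. Below is a minimal Python sketch for comparing that size against the GPU's BAR1 aperture. It assumes that BAR1 is what limits the mapping, and that `nvidia-smi -q -d MEMORY` prints a "BAR1 Memory Usage" section containing a "Total : N MiB" line. The helper names are made up for illustration, and only the first GPU listed is considered.

```python
import re
import subprocess

MIB = 1024 * 1024


def parse_bar1_total_mib(nvidia_smi_text):
    """Return the BAR1 'Total' value in MiB from `nvidia-smi -q -d MEMORY`
    text, or None if no such line is found (first GPU only)."""
    in_bar1 = False
    for line in nvidia_smi_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("BAR1 Memory Usage"):
            in_bar1 = True
            continue
        if in_bar1:
            match = re.match(r"Total\s*:\s*(\d+)\s*MiB", stripped)
            if match:
                return int(match.group(1))
    return None


def buffer_fits_in_bar1(buffer_bytes, bar1_total_mib):
    """True if a buffer of buffer_bytes is no larger than the whole BAR1."""
    return buffer_bytes <= bar1_total_mib * MIB


def read_bar1_total_mib():
    """Run nvidia-smi and parse its output (assumes nvidia-smi is on PATH)."""
    out = subprocess.run(
        ["nvidia-smi", "-q", "-d", "MEMORY"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_bar1_total_mib(out)
```

On the affected machine, `buffer_fits_in_bar1(536870912, read_bar1_total_mib())` would show whether the 512 MiB buffer can fit at all. This is only a rough check: other allocations also consume BAR1, so a buffer that fits on paper may still fail to map.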

Hi,

I was wondering if you still had this issue?
I am running into the same issue using the gpunetio_simple_receive example. I get the following similar output:

[14:18:04:277350][36339][DOCA][INF][gpunetio_simple_receive_main.c:155][main] Starting the sample
[14:18:04:575992][36339][DOCA][INF][gpunetio_simple_receive_main.c:185][main] Sample configuration:
        GPU 86:00.0
        NIC af:00.0

[14:18:04:581563][36339][DOCA][WRN][engine_model.c:90][adapt_queue_depth] adapting queue depth to 128.
EAL: Detected CPU lcores: 20
EAL: Detected NUMA nodes: 2
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
EAL: Probe PCI driver: mlx5_pci (15b3:101d) device: 0000:af:00.0 (socket 1)
EAL: Probe PCI driver: gpu_cuda (10de:20b7) device: 0000:86:00.0 (socket 1)
[14:18:06:245065][36339][DOCA][INF][gpunetio_simple_receive_sample.c:457][create_rxq] Creating Sample Eth Rxq
[14:18:06:245490][36339][DOCA][INF][gpunetio_simple_receive_sample.c:520][create_rxq] Mapping receive queue buffer (0x0x7f3d44000000 size 33554432B dmabuf fd 134) with dmabuf mode
[14:18:06:245539][36339][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x557bd454dd60: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[14:18:06:245621][36339][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[14:18:06:245639][36339][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[14:18:06:245662][36339][DOCA][ERR][doca_mmap.cpp:276][priv_doca_mmap_dev_to_mkey_init_mkey] Failed to initialize mkey: failed to create memory region with exception:
[14:18:06:245675][36339][DOCA][ERR][doca_mmap.cpp:276][priv_doca_mmap_dev_to_mkey_init_mkey] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[14:18:06:245686][36339][DOCA][ERR][doca_mmap.cpp:431][priv_doca_mmap_init_dev_to_mkey] Mmap 0x557bd4747380: Failed to initialize device=0x557bd454df80. err=DOCA_ERROR_DRIVER
[14:18:06:245697][36339][DOCA][ERR][doca_mmap.cpp:473][priv_doca_mmap_init_dev_to_mkeys] Mmap 0x557bd4747380: Failed to initialize memory range. Failed to register MR for device with id: 1. err=DOCA_ERROR_DRIVER
[14:18:06:245707][36339][DOCA][ERR][gpunetio_simple_receive_sample.c:544][create_rxq] Failed to start mmap DOCA Driver call failure
[14:18:06:245718][36339][DOCA][INF][gpunetio_simple_receive_sample.c:381][destroy_rxq] Destroying Rxq
[14:18:07:053283][36339][DOCA][ERR][gpunetio_simple_receive_sample.c:647][gpunetio_simple_receive] Function create_rxq returned Bad State
[14:18:07:053303][36339][DOCA][INF][gpunetio_simple_receive_sample.c:381][destroy_rxq] Destroying Rxq
[14:18:07:053317][36339][DOCA][ERR][gpunetio_simple_receive_sample.c:401][destroy_rxq] Failed to free gpu memory: Operation not permitted
[14:18:07:053328][36339][DOCA][ERR][gpunetio_simple_receive_sample.c:686][gpunetio_simple_receive] Function destroy_rxq returned Bad State
[14:18:07:053338][36339][DOCA][ERR][gpunetio_simple_receive_main.c:191][main] gpunetio_simple_receive() encountered an error: Bad State
[14:18:07:053349][36339][DOCA][INF][gpunetio_simple_receive_main.c:203][main] Sample finished with errors

My NIC is a ConnectX-6 Dx and my GPU is an A30. I have no issues with my BAR size.
In https://docs.nvidia.com/doca/sdk/doca+gpunetio/index.html#src-2617060674_safe-id-aWQtLkRPQ0FHUFVOZXRJT3YyLjcuMC1HUFVNZW1vcnlNYXBwaW5nKG52aWRpYS1wZWVybWVtdnMuZG1hYnVmKQ, it is written that:

Later, when calling the doca_mmap_start, the DOCA library tries to map the GPU memory buffer using the dmabuf file descriptor. If it fails (something incorrectly set on the Linux system), it fallbacks trying to map the GPU buffer with the legacy mode (nvidia-peermem). If it fails, an informative error is returned.

In my case, the program uses dmabuf and then returns the error above as the 'informative' error.
What steps could I take to resolve this?
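
As a side note, the two logs in this thread appear to report the same underlying errno: the first prints "Bad address" and the second prints "errno14 (14)". A small Python sketch to decode the number (the helper name `describe_errno` is just for illustration):

```python
import errno
import os


def describe_errno(code):
    """Return the symbolic name and OS message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({code}): {os.strerror(code)}"


# errno 14 is EFAULT; on Linux its message is "Bad address".
print(describe_errno(14))
```

So the memory registration fails with EFAULT in both the nvidia-peermem case and the dmabuf case. That does not identify the cause by itself.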

Hello,

I did manage to fix the issue: the problem was indeed caused by resizable BAR. We tested on another server that has resizable BAR and were able to run the application, but we are not able to receive any packets with it.

Regarding your issue: since it was not able to map the GPU buffer with nvidia-peermem, there might be a problem with the nvidia-peermem module. But I am pretty new to DOCA and NVIDIA, so I am not really sure what the actual problem is ^^
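
One way to check the nvidia-peermem suggestion above is to look for the module in /proc/modules. A minimal Python sketch, assuming the module appears there as `nvidia_peermem` (with an underscore); the function names are hypothetical:

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules-style text.
    The module name is the first whitespace-separated field of each line."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def is_module_loaded(name, proc_modules_text):
    """Check for a module by name; dashes are treated as underscores."""
    return name.replace("-", "_") in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules (Linux only)."""
    with open(path) as f:
        return f.read()
```

Calling `is_module_loaded("nvidia-peermem", read_proc_modules())` on the affected host shows whether the module is loaded. If it returns False, loading it with `sudo modprobe nvidia-peermem` is the usual next step, though that may not be the whole story here.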