Title / Summary:
[BUG] DOCA GPUNetIO sample fails to initialize DPDK when CPU is bound to non-default NUMA node
Product:
DOCA SDK 3.1.0105
Component:
GPUNetIO / DOCA Flow / DPDK integration
Environment:
OS: RHEL 9.6
CPU: Dual-socket x86_64 system (2 NUMA nodes)
GPU: NVIDIA GPU attached to NUMA node 1
NIC: Mellanox NIC attached to NUMA node 1
DPDK: mlnx-dpdk 22.11 (bundled with DOCA SDK)
Hugepages: 1GB hugepages configured correctly
Description:
On a multi-NUMA system, the DOCA GPUNetIO sample application gpunetio_simple_receive fails to initialize DPDK when the process is started with any CPU NUMA binding (for example, numactl --cpunodebind=1).
When the application is executed without numactl, it runs successfully.
When CPU NUMA binding is applied (even without any memory binding), DPDK initialization fails with the following error:
EAL: Detected NUMA nodes: 1
EAL: FATAL: Invalid 'command line' arguments.
This suggests that the implicit DPDK EAL initialization performed internally by DOCA does not account for a restricted CPU affinity, and that it generates invalid EAL parameters when the NUMA nodes visible to the process differ from the full system topology.
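To make the difference between the process view and the system view concrete, the following is a minimal diagnostic sketch. It is not part of DOCA or DPDK. It reads the process CPU affinity and the standard Linux sysfs files under /sys/devices/system/node to list which NUMA nodes still have allowed CPUs. The function names parse_cpulist and nodes_for_cpus, and the script name used below, are hypothetical and chosen only for illustration. How DPDK itself derives "Detected NUMA nodes" is not confirmed here.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # memory-only nodes have an empty cpulist
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def nodes_for_cpus(cpus, node_root="/sys/devices/system/node"):
    """Return {node_id: allowed CPUs on that node} for nodes overlapping `cpus`."""
    wanted = set(cpus)
    result = {}
    for node_dir in sorted(Path(node_root).glob("node[0-9]*")):
        node_cpus = parse_cpulist((node_dir / "cpulist").read_text())
        overlap = node_cpus & wanted
        if overlap:
            result[int(node_dir.name[4:])] = overlap
    return result


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)  # Linux-only: CPUs this process may run on
    print("allowed CPUs:", sorted(allowed))
    for node, cpus in nodes_for_cpus(allowed).items():
        print(f"node {node}: {len(cpus)} allowed CPUs")
```

Saving this as, for example, numa_view.py (hypothetical name) and running it once directly and once under numactl --cpunodebind=1 shows how many NUMA nodes remain reachable from the bound process, which can be compared against the "Detected NUMA nodes" line in the EAL log.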
Steps to Reproduce:
- Use a system with 2 NUMA nodes.
- Ensure both the GPU and NIC are attached to NUMA node 1.
- Run the sample without NUMA binding:
./doca_gpunetio_simple_receive -n <NIC_PCI> --gpu <GPU_PCI>
Result: Application runs successfully.
- Run the same sample with CPU NUMA binding:
numactl --cpunodebind=1 ./doca_gpunetio_simple_receive -n <NIC_PCI> --gpu <GPU_PCI>
Result: Application fails with DPDK EAL fatal error.
- The failure occurs even when no memory binding (--membind) is used.
Expected Behavior:
The DOCA GPUNetIO sample should do one of the following: correctly handle CPU NUMA binding when initializing DPDK, explicitly document that NUMA binding is unsupported, or provide a supported mechanism to control or override the DPDK EAL NUMA parameters.
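As an illustration of what an explicit override could look like, the sketch below builds an EAL-style argument list from a set of allowed CPUs. The -l (core list) and -a (allowed PCI device) options are standard DPDK EAL options. Whether DOCA offers any way to pass such arguments to its internal initialization is an assumption and is exactly what this report asks about. The helper names format_corelist and explicit_eal_args, the program name "app", and the PCI address in the test values are hypothetical.

```python
def format_corelist(cpus):
    """Compress CPU ids into a range list such as '16-19,24'."""
    ids = sorted(set(cpus))
    ranges = []
    i = 0
    while i < len(ids):
        start = ids[i]
        # Extend the current run while the next id is consecutive.
        while i + 1 < len(ids) and ids[i + 1] == ids[i] + 1:
            i += 1
        end = ids[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def explicit_eal_args(cpus, nic_pci, prog="app"):
    """Build an argv-style list with an explicit core list and allowed NIC.

    Assumption: the consumer would hand this list to DPDK EAL initialization.
    """
    return [prog, "-l", format_corelist(cpus), "-a", nic_pci]
```

Pinning the core list explicitly in this way would remove any dependence on how the default core set is derived from the process affinity, but it has not been verified that doing so avoids the failure described above.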
Impact:
This issue prevents running GPUNetIO applications with CPU affinity aligned to GPU and NIC NUMA locality, resulting in unavoidable cross-NUMA execution and measurable performance degradation on multi-socket systems.
Additional Notes:
The system NUMA topology has been checked with numactl -H and is reported correctly.
The issue is fully reproducible and occurs consistently.
This appears to be a limitation or bug in DOCA’s implicit DPDK initialization path.
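To complement the numactl -H check, device locality can be confirmed from sysfs. The sketch below reads the numa_node attribute that Linux exposes for each PCI device. The function name pci_numa_node is hypothetical, and the addresses passed on the command line must be full PCI addresses including the domain.

```python
import sys
from pathlib import Path


def pci_numa_node(bdf, pci_root="/sys/bus/pci/devices"):
    """Return the NUMA node the kernel reports for a PCI device, or None."""
    path = Path(pci_root) / bdf / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # device not found or attribute unreadable
    return node if node >= 0 else None  # -1 means no node affinity reported


if __name__ == "__main__":
    # Usage: pass one or more PCI addresses, e.g. the NIC and GPU addresses.
    for bdf in sys.argv[1:]:
        print(bdf, "->", pci_numa_node(bdf))
```

On the system described in this report, both the NIC and the GPU addresses would be expected to report node 1.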