Network filtering on Jetson using DOCA GPUNetIO / GPUDirect RDMA

Hi, I am trying to implement CUDA-accelerated network filtering on a Jetson device. I have both an Orin NX 16GB and an AGX Orin 64GB Developer Kit available to me.

I found several articles about DOCA GPUNetIO, which seems to be exactly what I need. However, as far as I understand, DOCA only works on BlueField devices. Is that the case or can I use it on Jetson as well?

DOCA is based on GPUDirect RDMA technology, which should be available on Jetson platforms (cudaGetDeviceProperties → gpuDirectRDMASupported reports 1). It would be fine for me to use GPUDirect RDMA directly without the DOCA framework, should that be necessary.

On my Orin NX an “Aquantia Corp.” 10Gbit/s network adapter is installed. However, I do not know whether this device supports RDMA or not. How can I find this information?

My AGX Orin has no additional NIC as of yet. I could install whatever NIC is suitable for my use-case. What adapter would you recommend for this?

Finally, is what I am trying to do even viable on a Jetson?
How can I get started without too much setup overhead?

Any help would be highly appreciated!

Dear Stefan,
Welcome to the NVIDIA Developer community.
Regarding your questions:

Implementing CUDA-accelerated network filtering on Jetson devices using GPUDirect RDMA is viable with specific hardware and setup considerations.

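DOCA GPUNetIO Compatibility

  • As noted in the summary at the end of this reply, DOCA is not the route to take on Jetson; use GPUDirect RDMA directly, as described in the following sections.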

GPUDirect RDMA on Jetson

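  • GPUDirect RDMA support on Jetson started with CUDA 11.2 (Jetson AGX Xavier / Drive AGX Xavier); Jetson AGX Orin and Orin NX run later CUDA releases and report support as well. Verify it on your device using:
#include <cuda_runtime.h>

cudaDeviceProp props;
cudaGetDeviceProperties(&props, device_id);  // device_id: CUDA device index, e.g. 0
if (props.gpuDirectRDMASupported) { ... } // Field is 1 if supported[1]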

Key requirements:

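  • On systems with a discrete GPU, the NIC and GPU should share the same PCIe root complex. On Jetson the GPU is integrated into the SoC rather than attached over PCIe, so the requirement reduces to the NIC being attached to one of the SoC's PCIe controllers.
  • The IOMMU (the SMMU on Jetson) must be disabled or configured for 1:1 passthrough.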
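
To see whether the running kernel was actually booted with a given IOMMU parameter, you can inspect /proc/cmdline. The following is only a minimal sketch in C: the function names cmdline_has_param and running_kernel_has_param are made up for illustration, and it only checks the boot parameters, not the device-tree SMMU configuration.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the whitespace-separated kernel command line contains the
 * exact token `param` (e.g. "iommu.passthrough=1"), 0 otherwise. */
int cmdline_has_param(const char *cmdline, const char *param)
{
    size_t plen = strlen(param);
    const char *p = cmdline;

    while (*p) {
        /* Skip separators between tokens. */
        while (*p == ' ' || *p == '\n' || *p == '\t')
            p++;
        /* Measure the current token and compare it as a whole. */
        size_t tlen = strcspn(p, " \n\t");
        if (tlen == plen && strncmp(p, param, plen) == 0)
            return 1;
        p += tlen;
    }
    return 0;
}

/* Read /proc/cmdline and report whether `param` is present.
 * Returns 1 or 0, or -1 if the file cannot be read. */
int running_kernel_has_param(const char *param)
{
    char buf[4096];
    FILE *f = fopen("/proc/cmdline", "r");
    if (!f)
        return -1;
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return cmdline_has_param(buf, param);
}
```

Calling running_kernel_has_param("iommu.passthrough=1") from your own main() then tells you whether that parameter was passed at boot. Tokens are matched as a whole, so "iommu.passthrough=10" would not count as a match.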

NIC Compatibility

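Orin NX’s Aquantia 10G NIC: identify the driver and PCIe capabilities with the commands below. Note that these do not report RDMA support by themselves; an RDMA-capable NIC with its driver loaded shows up as a device under /sys/class/infiniband (also listed by ibv_devices from rdma-core).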

ethtool -i <interface> | grep driver
lspci -vvv -s <PCI_address> | grep -i 'capabilities'

AGX Orin NIC Recommendations:

  • Use NVIDIA (Mellanox) ConnectX-6/7 adapters or a BlueField-3 DPU, which support GPUDirect RDMA and RoCE/InfiniBand.
  • Ensure the NIC is PCIe Gen4 x8/x16 for optimal bandwidth.
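
To put the PCIe recommendation into numbers, here is a rough back-of-the-envelope calculation as a C sketch. It assumes 16 GT/s per lane for Gen4 and 128b/130b line encoding, and it ignores packet (TLP) and protocol overhead, so real throughput will be lower. The function name pcie_bandwidth_gbytes is made up for illustration.

```c
#include <stdio.h>

/* Approximate one-direction PCIe bandwidth in GB/s for a link running at
 * `gt_per_s` gigatransfers per second per lane with `lanes` lanes.
 * Assumes 128b/130b encoding (PCIe Gen3 and newer) and no protocol overhead. */
double pcie_bandwidth_gbytes(double gt_per_s, int lanes)
{
    const double encoding = 128.0 / 130.0; /* payload bits per transferred bit */
    return gt_per_s * encoding * lanes / 8.0; /* divide by 8: bits to bytes */
}

/* Print the estimate for a given link configuration. */
void print_pcie_estimate(const char *label, double gt_per_s, int lanes)
{
    printf("%s: about %.2f GB/s per direction\n",
           label, pcie_bandwidth_gbytes(gt_per_s, lanes));
}
```

For Gen4 x8 this gives roughly 15.75 GB/s per direction. For comparison, a 10 Gbit/s Ethernet port needs only about 1.25 GB/s, so the link width mostly matters for 100 Gbit/s-class adapters.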

Implementation Steps

  1. Hardware Setup:
  • Install a compatible NIC on AGX Orin (e.g., ConnectX-6).
  • Confirm with lspci -t that the NIC enumerates and note which PCIe controller it is attached to (on Jetson the integrated GPU itself does not appear on the PCIe bus).
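  2. Software Configuration:
  • Put the IOMMU in passthrough mode via the kernel boot parameters (iommu.passthrough=1 on arm64 kernels such as Jetson's; intel_iommu=off applies only to Intel x86 hosts).
  • Use CUDA’s GPUDirect RDMA support for direct GPU-NIC memory transfers:
    • Allocate the buffer with cudaHostAlloc() on Jetson (cudaMalloc() on discrete-GPU systems). Do not call cudaHostRegister() on a cudaMalloc() pointer; it is meant for ordinary host memory and fails on device memory.
    • Register the buffer with the NIC through the driver-specific API (e.g., ibv_reg_mr() for InfiniBand/RoCE verbs).
  3. Example Workflow:
// Allocate the buffer. NVIDIA's GPUDirect RDMA guide calls for cudaHostAlloc()
// on Jetson (Tegra); on discrete-GPU systems this would be cudaMalloc().
void *d_data = NULL;
cudaHostAlloc(&d_data, size, cudaHostAllocDefault);

// Register the buffer with the NIC (pd is a previously allocated struct ibv_pd *)
struct ibv_mr *mr = ibv_reg_mr(pd, d_data, size, IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);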

Summary

  • Jetson AGX Orin/Orin NX can achieve CUDA-accelerated network filtering via GPUDirect RDMA, but not DOCA.
  • The Orin NX's Aquantia NIC is unlikely to offer RDMA; for the AGX Orin, install an RDMA-capable NIC (e.g., ConnectX-6).
  • Focus on PCIe topology, IOMMU settings, and CUDA/NIC driver integration for optimal performance.

I hope this helps!

Thanks and have a great day!
Ilan.