Configuring YAML files for the HoloHub advanced network bench

Hello,

I am currently trying to test the HoloHub advanced network operator application to prove that we can reach the desired throughput between two devices. One device is an NVIDIA Jetson AGX Orin (running the application natively), and the other is a server with an A100 card and a ConnectX-6 NIC (running the application in a container). I was able to build both applications on their respective devices, but whenever I run them I get a string of errors that end in segmentation faults. I have enabled hugepages for both the container and the Jetson, set the CPU governor, and configured the YAML files to the best of my ability. On the server, I am seeing the error “Could not DMA map EXT memory” for my mlx5 device. On the Jetson, I am not getting much output at all. At this point I am stumped and would appreciate any ideas.

Thanks,

Hi there, sorry for the late reply. Could you tell us more about:

  • whether you’re using the advanced network operator on both the Jetson AGX Orin and the server
  • whether you followed specific documentation to install the drivers that enable GPUDirect RDMA from your ConnectX NIC, and if so, which documentation exactly

Thank you!

Hi @michael.womack, can you paste the errors from one or both of your devices? Some things to make sure of:

  1. The NIC address in the config matches your devices
  2. You recompile HoloHub after a config change (the build moves the YAML files)
  3. “Could not DMA map EXT memory” typically means you don’t have the peer memory kernel module (nvidia-peermem, or nv_peer_mem on older setups) installed and loaded. Are you trying to do GPUDirect to a discrete GPU, or just CPU packet processing with iGPU processing?
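
If it helps, items 1 and 3 can be checked quickly with a small script. This is only a rough sketch, not something from the HoloHub repo: the helper names are made up for illustration, and it assumes a Linux host where `/proc/modules` and `/sys/class/net` are readable and where the peer memory module shows up as `nvidia_peermem` or `nv_peer_mem`.

```python
import os
from pathlib import Path

# Assumption: the peer memory module is listed under one of these names.
# nvidia_peermem ships with recent NVIDIA drivers; nv_peer_mem is the older
# standalone module.
PEERMEM_MODULES = ("nvidia_peermem", "nv_peer_mem")


def loaded_peermem_modules(proc_modules_text):
    """Return the peermem module names present in /proc/modules-style text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in PEERMEM_MODULES if name in loaded]


def nic_device_addresses(sys_class_net="/sys/class/net"):
    """Map interface name -> backing device address.

    For PCIe NICs the address is the PCI bus/device/function string.
    Virtual interfaces (lo, bridges) have no device link and are skipped.
    """
    result = {}
    root = Path(sys_class_net)
    if not root.is_dir():
        return result
    for iface in sorted(root.iterdir()):
        device = iface / "device"
        if device.exists():
            result[iface.name] = os.path.basename(os.path.realpath(device))
    return result


if __name__ == "__main__":
    try:
        modules_text = Path("/proc/modules").read_text()
    except OSError:
        modules_text = ""
    found = loaded_peermem_modules(modules_text)
    print("peermem modules loaded:", ", ".join(found) if found else "none")
    for name, address in nic_device_addresses().items():
        print(f"{name}: {address}")
```

Run it inside the container as well as on the host, since the container only sees what has been passed through to it. The address printed for the mlx5 interface is the one to compare against the NIC address in your YAML config.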