I’m running some network and NVMe benchmarks on an Orin NX system running JetPack 5.1.2.
The setup consists of a PCIe switch installed in the Gen4 x4 M.2 PCIe slot. One downstream port of the switch is connected to a dual-port 25GbE Mellanox card, and the other downstream port is connected to 6 NVMe SSDs.
The NX system is able to receive data on the two 25 Gbps Ethernet ports at an aggregate rate of around 6000 MB/s. It is also capable of writing to the NVMe SSDs at a rate of 6000 MB/s. However, when I receive data over the 25GbE network interfaces while writing to the NVMe SSDs at the same time, the aggregate network receive rate drops to around 3500 MB/s. CPU utilization is only around 50%, so I’m not sure why the NVMe SSD write activity slows down the network receive rate so much. The NX system is able to send and receive over the two 25GbE interfaces at 6000 MB/s in both directions simultaneously, so it doesn’t appear to be a PCIe bus or memory limitation.
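For reference, the rates above can be compared against the theoretical link limits with some back-of-the-envelope arithmetic. This is only a rough sketch: it assumes 16 GT/s per lane with 128b/130b encoding for PCIe Gen4, ignores TLP and Ethernet framing overhead, and the function names are made up for illustration.

```python
# Rough link budgets; protocol overhead (TLP headers, Ethernet framing) is ignored.

def pcie_gen4_mb_per_s(lanes: int) -> float:
    """Theoretical PCIe Gen4 throughput in MB/s, per direction."""
    transfers_per_s = 16e9      # 16 GT/s per lane
    encoding = 128 / 130        # 128b/130b line encoding
    return transfers_per_s * encoding * lanes / 8 / 1e6


def ethernet_mb_per_s(gbps: float, ports: int) -> float:
    """Raw Ethernet line rate in MB/s, per direction, summed over ports."""
    return gbps * 1e9 * ports / 8 / 1e6


pcie_limit = pcie_gen4_mb_per_s(4)      # Gen4 x4 upstream link
nic_limit = ethernet_mb_per_s(25, 2)    # dual-port 25GbE

# Rates reported in this post, in MB/s.
for label, rate in [("separate tests", 6000), ("network RX during NVMe writes", 3500)]:
    print(f"{label}: {rate / pcie_limit:.0%} of PCIe Gen4 x4, "
          f"{rate / nic_limit:.0%} of 2x25GbE line rate")
```

Assuming all data is staged in system memory, network receive traffic flows from the NIC up to the host, while NVMe write data flows from the host down to the SSDs, so the two streams use opposite directions of the x4 upstream link.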
I don’t understand why the network rates slow down so much with NVMe activity when there appears to be plenty of free CPU resources available. Interrupts appear to be distributed fairly evenly across the 8 CPUs.
~$ cat /proc/interrupts
         CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7
311:        6        0        0        0        0        0        0        0  MSI 541065216 Edge nvme0q0
312:        6        0        0        0        0        0        0        0  MSI 541589504 Edge nvme1q0
313:        6        0        0        0        0        0        0        0  MSI 542638080 Edge nvme3q0
314:   628763        0        0        0        0        0        0        0  MSI 541065217 Edge nvme0q1
315:        6        0        0        0        0        0        0        0  MSI 543162368 Edge nvme4q0
316:       43        0        0        0        0        0   628338        0  MSI 541589505 Edge nvme1q1
317:        6        0        0        0        0        0        0        0  MSI 544735232 Edge nvme5q0
322:       40        0        0     1511        0   626763        0        0  MSI 542638081 Edge nvme3q1
323:       40        0   628135        0        0        0        0        0  MSI 543162369 Edge nvme4q1
324:      391        0        0        0        0        0   628765        0  MSI 544735233 Edge nvme5q1
329:        6        0        0        0        0        0        0        0  MSI 542113792 Edge nvme2q0
330:       53        0        0        0   628198        0        0        0  MSI 542113793 Edge nvme2q1
350:        0        0        5        0        0        0        0  7663880  MSI 539492362 Edge mlx5_comp10@pci:0004:05:00.0
363:        0  7909327        3        0        0        0        0    31718  MSI 539494410 Edge mlx5_comp10@pci:0004:05:00.1
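To make the per-CPU distribution easier to read, the output above can be summed per CPU and per device type. The following is a minimal sketch, not a polished tool: `per_cpu_totals` and `SAMPLE` are hypothetical names, the sample rows are copied from the output above, and the parser assumes the usual `/proc/interrupts` layout of an IRQ label followed by one count per CPU.

```python
from typing import Dict

# A few rows copied from the output above; on the target system the text
# could instead be read from /proc/interrupts.
SAMPLE = """\
 CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
314: 628763 0 0 0 0 0 0 0 MSI 541065217 Edge nvme0q1
316: 43 0 0 0 0 0 628338 0 MSI 541589505 Edge nvme1q1
350: 0 0 5 0 0 0 0 7663880 MSI 539492362 Edge mlx5_comp10@pci:0004:05:00.0
363: 0 7909327 3 0 0 0 0 31718 MSI 539494410 Edge mlx5_comp10@pci:0004:05:00.1
"""


def per_cpu_totals(text: str, name_filter: str = "") -> Dict[str, int]:
    """Sum interrupt counts per CPU for rows whose text contains name_filter."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cpus = lines[0].split()                      # header row: CPU0 CPU1 ...
    totals = {cpu: 0 for cpu in cpus}
    for ln in lines[1:]:
        fields = ln.split()
        counts = fields[1:1 + len(cpus)]
        # Skip summary rows that do not carry one numeric count per CPU.
        if len(counts) < len(cpus) or not all(c.isdigit() for c in counts):
            continue
        if name_filter not in ln:
            continue
        for cpu, count in zip(cpus, counts):
            totals[cpu] += int(count)
    return totals


nvme_totals = per_cpu_totals(SAMPLE, "nvme")
mlx5_totals = per_cpu_totals(SAMPLE, "mlx5")
print("nvme:", nvme_totals)
print("mlx5:", mlx5_totals)
```

Filtering by `"nvme"` and `"mlx5"` separately shows which CPUs service each device, which is more informative than the overall total when checking whether NIC and NVMe completions land on the same cores.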
I have already tried running jetson_clocks and setting nvpmodel to mode 0 for maximum power. Is there anything else I can do to improve performance?