Setup details:
Platform 1 (Storage Target):
- SSD
- ConnectX-6 VPI adapter; 100 Gb/s (HDR100, EDR IB, and 100 GbE); single-port QSFP56; PCIe 3.0 x16
- Optical cable
- Ubuntu 20.04 server
Configured the target following the details from ESPCommunity, using InfiniBand.
Platform 2: DPU with Host Server:
- NVIDIA BlueField-3 B3140L E-Series FHHL SuperNIC, with firmware parameters NVME_EMULATION_ENABLE=1 and NVME_EMULATION_NUM_PF=1 set for NVMe emulation
- Ubuntu 20.04 on the host OS
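For reference, the firmware parameters above can be set with mlxconfig from the MFT tools (the device path below is an assumption; list your devices with mst status and substitute the correct one). The new values only take effect after a full power cycle of the host:

```shell
# Enable NVMe emulation with one physical function on the BlueField-3.
# /dev/mst/mt41692_pciconf0 is an assumption; run `mst status` to find yours.
sudo mlxconfig -d /dev/mst/mt41692_pciconf0 set \
    NVME_EMULATION_ENABLE=1 \
    NVME_EMULATION_NUM_PF=1

# Verify the pending values before power-cycling the host.
sudo mlxconfig -d /dev/mst/mt41692_pciconf0 query | grep NVME_EMULATION
```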
Configured SNAP with SNAP_RDMA_ZCOPY_ENABLE for a single controller, following the single-controller example in SNAP Deployment - NVIDIA Docs. The emulated controller is visible on the host, and IO operations against the SSD on the storage target complete successfully. I am now looking to optimize for maximum IOPS and throughput.
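For measuring where the ceiling currently is, I have been using fio runs of roughly this shape against the emulated namespace on the host (the device name, queue depths, and job counts are assumptions to tune for your setup, not recommendations):

```shell
# 4k random-read IOPS test against the emulated NVMe namespace.
# /dev/nvme0n1 is an assumption; check `nvme list` for the actual device.
sudo fio --name=randread --filename=/dev/nvme0n1 \
    --ioengine=libaio --direct=1 --rw=randread \
    --bs=4k --iodepth=64 --numjobs=8 --group_reporting \
    --runtime=60 --time_based

# Large-block sequential read test for throughput.
sudo fio --name=seqread --filename=/dev/nvme0n1 \
    --ioengine=libaio --direct=1 --rw=read \
    --bs=128k --iodepth=32 --numjobs=4 --group_reporting \
    --runtime=60 --time_based
```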
Things tried for maximizing throughput:
- Set NVME_EMU_provider=dpu instead of dpa. Received these errors on the host: "NVMExpressPassThru: Timeout occurs for an NVMe command." and "NvmExpress: Failed to identify active namespace."
- Increased memory limits in doca_snap.yaml. No difference in IOPS.
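For context, the memory change mentioned above was of this shape in the pod spec of doca_snap.yaml (the values are illustrative only, and the exact resource keys may differ across SNAP releases):

```yaml
# Excerpt of the SNAP container pod spec; values are assumptions, not
# recommendations. Standard Kubernetes resource/hugepage keys are used.
resources:
  requests:
    memory: "2Gi"
    cpu: "8"
  limits:
    hugepages-2Mi: "4Gi"
    memory: "4Gi"
    cpu: "16"
```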
Is there any other DPU or SNAP configuration that would help maximize throughput and IOPS for these IO operations?