Please see @nhedberg’s inquiry within the closed topic AI-RAN on DGX Spark: “Did your DGX Spark install SRK’s real time OS?”
The direct answer to his question is that my DGX Spark received the same script output and did not install the Custom Kernel.
The SRK kernel documentation and step 1 of the SRK setup guide imply that the custom kernel is only required on the Jetson AGX Orin/Thor platform. According to the DGX OS User Guide, DGX OS 7 is performance-tuned but does not have a real-time kernel, although the DGX Spark systems documentation implies the platform is optimized for “real-time AI applications”.
I’ve already validated SRK will deploy on DGX Spark and am actively evaluating the platform. I’ve successfully deployed and completed all of the tutorials, including Software-defined End-to-End 5G Network.
Evaluating with iperf3, my finding is that SRK’s L1/L2 performance is significantly more stable than alternative platforms (e.g. its CUDA LDPC decode finishes roughly 100x faster than CPU-based alternatives). Application-layer bit rate is comparable to other OAI-based stacks, but latency and round-trip time are still significantly degraded.
There are many reasons the base OAI platform does not achieve theoretical spectral efficiency and network performance. It is a C/C++ 3GPP R15 reference implementation with cherry-picked R16/R17 features, and liberties are taken. Many of them sit above L1 in the data plane: RLC, PDCP, RRC, and SDAP all use fixed queue sizes, and data transfer relies on stateful malloc, calloc, and memcpy calls. grep reports ~800 memcpy calls across the end-to-end OAI stack. I have not observed stateless DPDK or gRPC above the O-DU.
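For anyone who wants to reproduce a tally like the memcpy count above, here is a minimal Python sketch of a grep-style count. The function name `count_calls`, the suffix list, and the regex are illustrative assumptions. It is a purely textual match, so it also counts occurrences inside comments and strings, and the totals will depend on which directories you point it at.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: these suffixes cover the C/C++ sources of interest.
SOURCE_SUFFIXES = {".c", ".h", ".cc", ".cpp", ".hpp"}

# Matches "memcpy(", "malloc (" etc. as whole identifiers, so names such as
# my_memcpy or xmalloc are not counted.
CALL_RE = re.compile(r"\b(memcpy|malloc|calloc)\s*\(")


def count_calls(root):
    """Return a Counter of memcpy/malloc/calloc call sites under `root`.

    This is a textual count (like grep), not a parse of the C code.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            text = path.read_text(errors="ignore")
            counts.update(match.group(1) for match in CALL_RE.finditer(text))
    return counts
```

Calling `count_calls("path/to/source/tree")` returns per-function totals, which makes it easy to compare individual layers by pointing it at their subdirectories.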
iperf3 also has CPU-bound, non-vectorized limitations. Specifically, its packet generation interval is capped at roughly 800 ms to 1 s, so it is well suited to eMBB traffic profiling but cannot meet URLLC’s 10-50 ms traffic profiles at modern 3GPP-specified reliabilities (e.g. R16+ 99.99% delivery rate). I am evaluating whether alternative DPDK/gRPC-based generators can improve on this, and I am not sure whether any IP-based traffic generator specifies 5G NR fidelities. If anyone knows of an accelerated, URLLC-capable IP generator, please reply or DM me.
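To make the 10-50 ms traffic profile concrete, here is a minimal Python sketch of a paced UDP sender that emits one datagram per fixed interval and records the actual send times. It is not an accelerated or URLLC-capable generator, and a user-space Python loop on a non-real-time kernel should be expected to show scheduling jitter. The names `send_paced_udp` and `max_interval_error` and the 12-byte header layout are assumptions made for illustration.

```python
import socket
import struct
import time

HEADER_SIZE = 12  # 4-byte sequence number + 8-byte send timestamp


def send_paced_udp(host, port, interval_s, count, payload_size=64):
    """Send `count` UDP datagrams, one every `interval_s` seconds.

    Returns the actual send times (seconds, from time.perf_counter).
    """
    if payload_size < HEADER_SIZE:
        raise ValueError("payload_size must be at least 12 bytes")
    padding = b"\x00" * (payload_size - HEADER_SIZE)
    send_times = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        start = time.perf_counter()
        for seq in range(count):
            # Use absolute deadlines so per-packet sleep error does not accumulate.
            deadline = start + seq * interval_s
            remaining = deadline - time.perf_counter()
            if remaining > 0:
                time.sleep(remaining)
            now = time.perf_counter()
            header = struct.pack("!Id", seq, now)
            sock.sendto(header + padding, (host, port))
            send_times.append(now)
    return send_times


def max_interval_error(send_times, interval_s):
    """Largest absolute deviation of any inter-packet gap from the target."""
    gaps = [later - earlier for earlier, later in zip(send_times, send_times[1:])]
    return max((abs(gap - interval_s) for gap in gaps), default=0.0)
```

For example, `send_paced_udp("127.0.0.1", 5201, 0.010, 1000)` targets a 10 ms period, and `max_interval_error` on the result gives a first indication of how far the host's scheduling drifts from that target before any radio stack is involved.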
TL;DR - I do not expect picture-perfect real-time performance from SRK at this stage of maturation, as that is a very complex, full-stack ask. However, is a real-time kernel optimization required on DGX Spark to match the performance levels currently measured on the custom Jetson kernel?
Thanks,
Ryan