Hi there,
I am working on deploying my segmentation model on the Jetson Orin NX and the Jetson Orin Nano, so I compared the performance of the two devices.
I found some interesting points:
- The Orin Nano runs at a higher GPU frequency than the Orin NX in 25W mode, so the Orin Nano 8GB is actually faster than the Orin NX 16GB at inference (lower latency and higher throughput). However, for my case, 8GB of memory is not enough.
- The Orin NX does have a DLA. According to the spec, the DLA's role is to relieve the burden on the GPU by allowing certain layers of the model to run on the DLA, and it offers high power efficiency. However, after moving layers to the DLA, although the GPU load is reduced, inference is slower than running everything on the GPU.
This creates an awkward situation. If I need a 16GB device, my only option is the Orin NX, but its inference performance is inferior to the Orin Nano's, and even though it has a DLA, inference on the DLA is slower than on the GPU.
Do you have any suggestions on how to configure the Orin NX to optimize throughput with my model? (I have already converted it to TensorRT engines, both with and without DLA, and already applied reduced-precision optimization.)
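For context, the engines were built along these lines with `trtexec` (a sketch; `model.onnx` and the engine filenames are placeholder paths, and FP16 is shown as one example of reduced precision):

```shell
# Build a GPU-only FP16 engine (model.onnx is a placeholder path)
trtexec --onnx=model.onnx --fp16 --saveEngine=model_gpu.engine

# Build a DLA engine on core 0, letting unsupported layers fall back to the GPU
trtexec --onnx=model.onnx --fp16 --useDLACore=0 --allowGPUFallback \
        --saveEngine=model_dla.engine
```

Note that every fallback to the GPU introduces a DLA-GPU transition, which may explain part of the slowdown I am seeing.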
Or is there another device that suits my case? What I want is the best throughput with 16GB of memory.
By the way, I noticed that I can switch the power mode to MAXN, but is that a best practice? I have also noticed that it can make the system unstable.
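These are the commands I am referring to (a sketch; the MAXN mode ID varies by module, so the mode number below is an assumption to check against your device's mode table):

```shell
# Query the current power mode
sudo nvpmodel -q

# Switch to MAXN (mode 0 on many Jetson modules; verify in /etc/nvpmodel.conf)
sudo nvpmodel -m 0

# Lock clocks to their maximum for the current power mode
sudo jetson_clocks
```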
Thanks.