Why does the NVIDIA Orin Nano (15W) give better results than the Orin AGX (15W)?

Hi
I am testing the Orin AGX (15W) and the Orin Nano (15W) with the yolov5 family. These are the instances per second (IPS) I measured on each:

Model     Precision   IPS Orin AGX   IPS Orin Nano

yolov5s   int8        80.08585203    122.5208746
yolov5s   fp16        64.40725999    94.24570301
yolov5m   int8        45.31591994    74.08202274
yolov5m   fp16        31.65558721    48.69560749
yolov5l   int8        32.67226451    55.15715603
yolov5l   fp16        20.4430416     32.81257126
yolov5x   int8        18.50656465    30.93539371
yolov5x   fp16        11.57943492    18.20232719

Strangely, yolov5 gets better results on the Orin Nano.
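To put a number on the gap, the per-model speedup of the Orin Nano over the Orin AGX can be computed directly from the figures above. This is only a small Python sketch: the dictionary and variable names are illustrative, and the values are the IPS numbers exactly as posted.

```python
# IPS figures copied from the measurements above: (Orin AGX 15W, Orin Nano 15W).
ips = {
    "yolov5s int8": (80.08585203, 122.5208746),
    "yolov5s fp16": (64.40725999, 94.24570301),
    "yolov5m int8": (45.31591994, 74.08202274),
    "yolov5m fp16": (31.65558721, 48.69560749),
    "yolov5l int8": (32.67226451, 55.15715603),
    "yolov5l fp16": (20.4430416, 32.81257126),
    "yolov5x int8": (18.50656465, 30.93539371),
    "yolov5x fp16": (11.57943492, 18.20232719),
}

# Speedup of the Orin Nano relative to the Orin AGX for each model/precision.
speedup = {name: nano / agx for name, (agx, nano) in ips.items()}

for name, ratio in speedup.items():
    print(f"{name:<13} Nano/AGX = {ratio:.2f}x")
```

With these figures the Orin Nano comes out roughly 1.5x to 1.7x faster in every row.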

I would like to know why the results are better on the Nano. I also checked the GPU power and the GPU frequency.

Nano Orin GPU frequency is 627.
AGX Orin GPU frequency is 408.

Nano Orin VDD_GPU_SOC-IN_POWER jumps to 9000.
Orin AGX VDD_GPU_SOC-IN_POWER is around 6500.

Is it normal that the Orin Nano gets better results than the AGX Orin at 15W?

Hello @user22290

This needs to be posted in the Jetson category, so the support team has visibility.

I can move it over for you.

Please move it to the Jetson category. Thanks.

Done. :-)

Hi @user22290, the performance difference is due to the number of TPCs and the GPU Frequency in each mode:

  • The Jetson AGX Orin 15W mode uses 3 TPCs with a GPU Frequency of 420.75 MHz
  • The Orin Nano 15W mode uses 4 TPCs with a GPU Frequency of 625 MHz

Running more TPCs (and therefore more Tensor Cores) at a higher maximum frequency yields more performance. The two modes differ because Jetson AGX Orin also contains the Deep Learning Accelerator (DLA) and Vision Accelerator, which can be operated in the 15W mode and so take a share of the power budget. These accelerators can be used to offload portions of your AI and CV applications on AGX Orin. Orin Nano does not contain a DLA or Vision Accelerator, and therefore more power is budgeted to the GPU.
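As a rough sanity check on this explanation, the two modes can be compared under the simplifying assumption that GPU throughput scales linearly with the number of TPCs times the GPU clock. That scaling model is an assumption made purely for illustration; it ignores memory bandwidth, CPU clocks and other limits, so it should be read as an optimistic estimate rather than a prediction. A minimal Python sketch:

```python
# TPC counts and GPU clocks for the two 15W modes, as quoted above.
agx_tpcs, agx_mhz = 3, 420.75
nano_tpcs, nano_mhz = 4, 625.0

# ASSUMPTION (illustrative only): throughput is proportional to TPCs * clock.
clock_ratio = nano_mhz / agx_mhz      # gain from the higher GPU clock alone
tpc_ratio = nano_tpcs / agx_tpcs      # gain from the extra TPC alone
estimated_speedup = clock_ratio * tpc_ratio

print(f"clock ratio:       {clock_ratio:.2f}x")
print(f"TPC ratio:         {tpc_ratio:.2f}x")
print(f"estimated speedup: {estimated_speedup:.2f}x")
```

Under that assumption the Orin Nano 15W mode has close to twice the raw GPU capacity of the AGX Orin 15W mode, which points in the same direction as the higher IPS reported at the top of the thread.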

Depending on your use case, you can create your own custom nvpmodel for the AGX Orin that runs the GPU at the same frequency and number of TPCs as the Orin Nano, without the accelerators. Using the Jetson Power Estimator Tool (https://jetson-tools.nvidia.com/powerestimator/) you can identify the optimal performance and power for your AGX Orin application and create your custom nvpmodel from that.

But I don’t use the DLA in my comparison. Even when I don’t use the DLA, the GPU runs at 420.75 MHz, so the budget is allocated to the DLA whether I use it or not. Would a custom nvpmodel give me better results on the Orin AGX than on the Nano, or the same results? Is it better to use the DLA, or a custom nvpmodel for the best GPU frequency?

Yes, the nvpmodel specifies the maximum frequencies as if the DLA were in use, so as not to exceed the power budget of 15W.

By using the Power Estimator Tool and creating your own nvpmodel, you can tweak the frequencies for your use-case such that they are more GPU-focused if that is your desire.

The DLA typically has lower power consumption while the GPU has higher performance (i.e. lower latency) for inferencing. You could benchmark both for your application to determine which is best for your use-case.

I checked, and the AGX Orin has 8 TPCs. Why is it using only 3 or 4 of them?

I think it is because you have it in 15W mode, and in that mode it disables TPCs in order to meet the 15W power budget.
