Hi @user22290, the performance difference is due to the number of TPCs and the GPU Frequency in each mode:
The Jetson AGX Orin 15W mode uses 3 TPCs with a GPU frequency of 420.75 MHz.
The Orin Nano 15W mode uses 4 TPCs with a GPU frequency of 625 MHz.
Running more TPCs at a higher maximum frequency yields higher GPU performance. The two modes differ because Jetson AGX Orin also contains the Deep Learning Accelerator (DLA) and Vision Accelerator, which can be operated in its 15W mode, so part of the power budget is set aside for them. These accelerators can be used to offload portions of your AI and CV applications on AGX Orin. Orin Nano does not contain a DLA or Vision Accelerator, and therefore more power is budgeted to the GPU.
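To put rough numbers on the two 15W modes, here is a minimal Python sketch that multiplies the TPC count by the maximum GPU frequency quoted above. The `throughput_proxy` helper and the "TPC-MHz" unit are illustrative assumptions, not an NVIDIA metric. The sketch assumes each TPC does the same work per clock on both modules and that the workload is GPU-compute-bound, so it ignores memory bandwidth, CPU clocks, and everything else that affects real performance.

```python
# TPC counts and maximum GPU frequencies (MHz) quoted above for the 15W modes.
MODES = {
    "AGX Orin 15W": {"tpcs": 3, "gpu_mhz": 420.75},
    "Orin Nano 15W": {"tpcs": 4, "gpu_mhz": 625.0},
}


def throughput_proxy(tpcs, gpu_mhz):
    """Crude proxy for peak GPU throughput: active TPCs times max clock.

    Illustrative only; real performance depends on much more than this.
    """
    return tpcs * gpu_mhz


agx = throughput_proxy(**MODES["AGX Orin 15W"])
nano = throughput_proxy(**MODES["Orin Nano 15W"])
ratio = nano / agx

print(f"AGX Orin 15W:  {agx:.2f} TPC-MHz")    # 1262.25
print(f"Orin Nano 15W: {nano:.2f} TPC-MHz")   # 2500.00
print(f"Nano / AGX ratio: {ratio:.2f}x")      # 1.98x
```

Under these assumptions the Orin Nano 15W mode has roughly twice the peak GPU budget of the AGX Orin 15W mode, which is consistent with the direction of the difference you measured.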
You can create your own custom nvpmodel for your use case that runs the GPU on the AGX Orin at the same frequency and number of TPCs as the Orin Nano, without the accelerators. Using the Jetson Power Estimator Tool (https://jetson-tools.nvidia.com/powerestimator/), you can identify the best performance and power settings for your AGX Orin application and create your own custom nvpmodel from them.
But I don’t use the DLA for my comparison. Even if I don’t use the DLA, the GPU still runs at 420.75 MHz, so the budget is allocated to it whether I use it or not. Would a custom nvpmodel give me better results on AGX Orin than on Orin Nano, or the same results? Is it better to use the DLA, or a custom nvpmodel with the best GPU frequency?
Yes, the nvpmodel specifies the maximum frequencies as if the DLA were in use, so as not to exceed the 15W power budget.
By using the Power Estimator Tool and creating your own nvpmodel, you can tweak the frequencies for your use-case such that they are more GPU-focused if that is your desire.
The DLA typically has lower power consumption while the GPU has higher performance (i.e. lower latency) for inferencing. You could benchmark both for your application to determine which is best for your use-case.
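If it helps, a comparison like this can be sketched with a small latency harness in Python. This is a minimal sketch, not an NVIDIA tool: `benchmark`, `run_on_gpu`, and `run_on_dla` are hypothetical names, and the two `run_on_*` functions are placeholders that only sleep. You would replace them with calls that run one inference of your model on each engine and block until it completes. It measures latency only; power would have to be recorded separately, for example with `tegrastats`.

```python
import statistics
import time


def benchmark(infer, warmup=5, iters=50):
    """Return median and p95 latency in ms for a zero-argument callable.

    `infer` must block until the inference has finished, otherwise the
    timings only measure how long it takes to enqueue the work.
    """
    for _ in range(warmup):
        infer()  # warm-up runs are not timed
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95 = samples[min(len(samples) - 1, int(0.95 * len(samples)))]
    return {"median_ms": statistics.median(samples), "p95_ms": p95}


# Hypothetical placeholders: replace the bodies with one real inference
# on the GPU and on the DLA respectively.
def run_on_gpu():
    time.sleep(0.001)


def run_on_dla():
    time.sleep(0.002)


results = {
    "GPU": benchmark(run_on_gpu),
    "DLA": benchmark(run_on_dla),
}
for name, stats in results.items():
    print(f"{name}: median {stats['median_ms']:.2f} ms, "
          f"p95 {stats['p95_ms']:.2f} ms")
```

Run it once per nvpmodel (the stock 15W mode and your custom one) so that you are comparing the same model and input under each power configuration.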