Run AI models completely on Jetson AGX Orin DLAs

I was reading Figure 1 (DLA power efficiency) in this post:
https://developer.nvidia.com/blog/maximizing-deep-learning-performance-on-nvidia-jetson-orin-with-dla/#:~:text=DLA%20performance%20per%20watt%20is,models%20representing%20common%20use%20cases.

I have one question: most of the models (ResNeXt-50, ResNet-34, SSD-MobileNetV1) can’t be fully optimized to run 100% on the DLAs, so GPU fallback handles the unsupported operations on the GPU. How, then, were the DLA performance and power numbers in that post measured?

Dear @hank.fang.usa,
So, your question is how to account for DLA performance alone when the network has layers that the DLA does not support?

Yes. I first create the model’s TensorRT engine, then use trtexec to run inference on the DLA of the Jetson AGX Orin. The log shows some layers running on the GPU, so I assume the performance number reported by trtexec is a combination of DLA and GPU. I am wondering how the post separates the two so clearly and claims 3-5x power efficiency (DLA vs. GPU).
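To make the "some layers are running on GPU" observation concrete, here is a minimal Python sketch that splits a saved trtexec build log into DLA and GPU entries. It assumes the `[DlaLayer]` / `[GpuLayer]` markers that TensorRT 8.x prints during engine building; other versions may format this differently, and the sample log and layer names below are hypothetical.

```python
# Assumed TensorRT 8.x build-log markers; verify against your own log.
DLA_TAG = "[DlaLayer]"
GPU_TAG = "[GpuLayer]"

def split_layers_by_device(log_text):
    """Return (dla_layers, gpu_layers) listed in a trtexec build log."""
    dla, gpu = [], []
    for line in log_text.splitlines():
        if DLA_TAG in line:
            dla.append(line.split(DLA_TAG, 1)[1].strip())
        elif GPU_TAG in line:
            gpu.append(line.split(GPU_TAG, 1)[1].strip())
    return dla, gpu

# Hypothetical log excerpt, for illustration only.
sample_log = """\
[I] [TRT] ---------- Layers Running on DLA ----------
[I] [TRT] [DlaLayer] {ForeignNode[conv1...relu5]}
[I] [TRT] ---------- Layers Running on GPU ----------
[I] [TRT] [GpuLayer] softmax_out
"""

dla_layers, gpu_layers = split_layers_by_device(sample_log)
print(len(dla_layers), "DLA entries,", len(gpu_layers), "GPU entries")
```

If the GPU list is non-empty, the latency trtexec reports covers both the DLA and the GPU portions of the network.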

Dear @hank.fang.usa,
The models in the shared blog can run completely on DLA, so the comparison is DLA time vs. iGPU time across different power profiles. Please see the GitHub repository NVIDIA/Deep-Learning-Accelerator-SW (NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications) and click on the model names for repro steps. All models run on Orin’s DLA without GPU fallback using JP 5.1.1 and later.
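One way to check that a model really runs without GPU fallback is to build it with trtexec on a DLA core and leave out `--allowGPUFallback`; the build should then fail if any layer cannot be placed on the DLA. The Python sketch below only assembles such a command line. The helper name `build_trtexec_cmd` and the file name `model.onnx` are hypothetical, and `--fp16` is used because DLA supports only FP16 and INT8 precision.

```python
import shlex

def build_trtexec_cmd(onnx_path, dla_core=0, allow_gpu_fallback=False):
    """Assemble a trtexec command that targets one DLA core."""
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--useDLACore={dla_core}",
        "--fp16",  # DLA requires FP16 or INT8
    ]
    if allow_gpu_fallback:
        # With this flag, unsupported layers silently move to the GPU.
        cmd.append("--allowGPUFallback")
    return cmd

# Hypothetical model file; omitting fallback makes unsupported layers an error.
cmd = build_trtexec_cmd("model.onnx")
print(shlex.join(cmd))
```

The printed command can be run on the Jetson directly; a successful build without the fallback flag suggests the reported numbers are for the DLA alone.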