Lower performance with DLA enabled

We expected to get higher inference performance on the NX.
However, we got lower performance with DLA enabled when testing the YoloV3 model.

Yolov3 default got **PERF: 57.03 (55.15)
Yolov3 with DLA enabled got **PERF: 46.35 (46.82)

We added the following settings to the [property] group of config_infer_primary_yoloV3:
enable-dla=1
use-dla-core=0
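For reference, a minimal sketch of how these keys sit in the nvinfer configuration file (only the DLA-related keys are shown; the rest of the [property] group is unchanged):

```ini
[property]
# Offload this nvinfer instance to the DLA instead of the GPU
enable-dla=1
# Which DLA core to use (Xavier NX has two cores: 0 and 1)
use-dla-core=0
```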

Did we miss anything?

Thank you for any advice,


Hi,

DLA is extra hardware on Jetson Xavier.
Running a model on the DLA offloads work from the GPU rather than accelerating inference.

Thanks.

Hi AastaLLL,

Thank you for your prompt support.

So using the DLA may not give higher performance.
How can we get higher inference performance on the NX?
The NX documentation shows Tiny YoloV3 performance of around 562 fps.

However, we got the test result below.
Yolov3 with interval=2 got **PERF: 57.03 (55.15)
Yolov3 with interval=0 got **PERF: 25~26
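For context on the interval property: in DeepStream, interval=N makes nvinfer skip N frames between inferences, so only 1/(N+1) of the frames actually go through the network. A rough sketch of the arithmetic (the fps values are just the numbers reported above, not authoritative):

```python
def inferred_fraction(interval: int) -> float:
    """Fraction of frames actually sent through the network
    when nvinfer skips `interval` frames between inferences."""
    return 1.0 / (interval + 1)

def inference_fps(pipeline_fps: float, interval: int) -> float:
    """Approximate number of network inferences per second."""
    return pipeline_fps * inferred_fraction(interval)

# With interval=2 only every third frame is inferred, so a ~57 fps
# pipeline performs roughly 57 / 3 ≈ 19 network inferences per second,
# in the same ballpark as the ~25 fps seen with interval=0.
print(inference_fps(57.03, 2))
```

This is why interval=2 reports a much higher pipeline fps: most frames bypass inference entirely.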

Are these test results normal,
or is there anything we can do to improve them?

Also, is there a way to enable both DLAs at the same time? How?
Does enabling both DLAs at the same time help inference performance or not?

Thank you,

Hi,

Sorry, there were some unclear statements in my previous reply.

Inference with only the DLA won't get you higher performance,
since the DLA targets power saving and offloading GPU usage.

However, on Xavier NX you can try using the two DLAs and the GPU together.
This will give you the maximal performance of the Xavier NX.
We also use the two DLAs together with the GPU to get the 562 fps result.

To reproduce the benchmark result, you can use our script here directly:

Thanks.

Hi AastaLLL,

Thank you for your clear explanation.

One more question: is there a way to enable the two DLAs at the same time in DeepStream? How?

Thank you,

Hi AastaLLL,

We got an error when running benchmark.py.

NVIDIA-AI-IOT/jetson_benchmarks

The logs show "Error opening engine file".
For example: yolov3-tiny-416_b8_ws2048_gpu.txt (2.7 KB)

Is there a way to get the engine? Could we download it, or create it ourselves?

Also, we are still waiting for an update on "is there a way to enable the two DLAs at the same time in DeepStream, and how?"

Thank you,

Hi,

We suppose you should use the command here:

You can run one model on each DLA at a time.
So to use the two DLAs together, you can apply one of the following mechanisms:

  1. Run the detector on DLA0 and the classifier on DLA1.
  2. Run two pipelines with the same model, one on DLA0 and one on DLA1.
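For option 2, a minimal sketch of the per-pipeline nvinfer settings (the file names are hypothetical; everything except the DLA keys would be identical between the two config files):

```ini
# config_infer_dla0.txt (hypothetical name) - used by pipeline 1
[property]
enable-dla=1
use-dla-core=0

# config_infer_dla1.txt (hypothetical name) - used by pipeline 2
[property]
enable-dla=1
use-dla-core=1
```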

Thanks.

Hi AastaLLL,

Thank you for your prompt support.

Yes, I followed the page you provided.
Here are the steps I ran:
git clone https://github.com/NVIDIA-AI-IOT/jetson_benchmarks.git
cd jetson_benchmarks
mkdir models
sudo sh install_requirements.sh
python3 utils/download_models.py --all --csv_file_path <path-to>/benchmark_csv/nx-benchmarks.csv --save_dir <absolute-path-to-downloaded-models>

sudo python3 benchmark.py --all --csv_file_path /benchmark_csv/nx-benchmarks.csv --model_dir

The error message is attached:
err.log (3.0 KB)

Is there anything I missed?

Thank you,

Hi,

Do you have the log in models/?

Error in Build, Please check the log in: models

We are going to check this issue. Will share more information with you later.
Thanks.

Hi

Thank you,

Hi,

We just checked the script on Xavier NX with JetPack 4.4.
Everything works fine and we can get similar performance.

We're not sure why you hit an error in the YOLO benchmark.
Would you mind reflashing the device and trying again?

Thanks.

Hi AastaLLL,

Sorry, I still got the error after re-flashing the NX.
The Jetson is an NX eMMC module with a Nano b1 carrier board.
Here is what I did:
msg1.log (26.9 KB)

Did I miss anything?

However, I manually built the engine, tested the yolov3-tiny model, and got a mean of around 20 ms. Is that normal?
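As a sanity check on that 20 ms number: assuming it is the mean per-batch latency at batch size 1 on a single engine, it converts to throughput as follows (just arithmetic; the 562 fps benchmark figure comes from larger batches running on the GPU and both DLAs concurrently, per the earlier reply):

```python
def fps_from_mean_ms(mean_ms: float, batch_size: int = 1) -> float:
    """Throughput implied by a mean per-batch latency in milliseconds."""
    return 1000.0 / mean_ms * batch_size

# A 20 ms mean at batch size 1 is about 50 fps for a single engine,
# so a single-engine result nowhere near 562 fps is expected.
print(fps_from_mean_ms(20.0))  # 50.0
```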

Thank you for any advice,