Greetings everyone,
I have a Jetson AGX Orin 64GB and I'm testing it with the models from NVIDIA's Deep Learning Accelerator repo, specifically the "Orin Dense Performance" section of the page. Find it here: GitHub - NVIDIA/Deep-Learning-Accelerator-SW: NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
I cloned the repo, then successfully followed the steps to download the .onnx models and ran them using the command lines provided in the README.md at Deep-Learning-Accelerator-SW/scripts/prepare_models/README.md. However, when I try to reproduce the results reported in the "DLA dense performance" section, my performance is well below theirs.
Verbose-mode logs are attached below. As a new user I'm limited in how many links I can put in a post, so I'll attach these two; the other models showed the same behavior.
log_resnet50_MAXN.txt (246.6 KB)
log_ssd_mobilenetv1_MAXN.txt (242.2 KB)
- ResNet-50: theirs is 2037 fps, mine is 504 qps * 2 (batch) = 1008 fps.
- SSD-MobileNetV1: theirs is 2664 fps, mine is 655 qps * 2 (batch) = 1310 fps.
For the other models whose logs were omitted in this comment:
- RetinaNet ResNeXt-50: theirs is 78 fps, mine is 39 fps
- RetinaNet ResNet-34: theirs is 108 fps, mine is 53 fps
- SSD-ResNet-34: theirs is 83 fps, mine is 41 fps
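To spell out the arithmetic behind the numbers above: trtexec reports throughput in queries per second (qps), where each query processes one batch, so I multiply qps by the batch size to get fps. A minimal sketch (the `qps_to_fps` helper is just my label for this calculation, not part of any NVIDIA tool):

```python
def qps_to_fps(qps: float, batch: int) -> float:
    """Effective image throughput: each query covers `batch` images."""
    return qps * batch

# Numbers from my runs above (batch size 2 in both cases).
print(qps_to_fps(504, 2))  # ResNet-50: 1008 fps vs. 2037 fps reported
print(qps_to_fps(655, 2))  # SSD-MobileNetV1: 1310 fps vs. 2664 fps reported
```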
My results are consistently around half of their reported numbers. I copied and pasted their command lines for execution, so I don't think I'm missing an option. I also double-checked that the board was in MAXN power mode. I don't understand what I'm missing.
Thanks in advance :)