Jetson Orin Nano Super performance test issue

Hi,

I am currently testing the performance of the Jetson Orin Nano Super 8GB and have encountered a discrepancy:

According to your benchmark page (Benchmarks - NVIDIA Jetson AI Lab), the performance of Qwen2.5 7B is listed as 21.75 tokens/second.

However, when I tested using the dusty-nv/Qwen2.5-7B-Instruct-q4f16_ft-MLC model with the following command:

python3 benchmark.py --model /root/.cache/mlc_llm/dusty-nv/Qwen2.5-7B-Instruct-q4f16_ft-MLC --max-num-prompts 4 --prompt ~/.cache/mlc_llm/jetson-containers-master/data/prompts/completion_1024.json --prefill-chunk-size 1024 --save Qwen2.5-7B-Instruct-q4f16_ft-MLC.csv

I obtained a result of 19.23 tokens/second.
I have attached the image file Qwen2.5-7B-Instruct-q4f16_ft-MLC with the detailed results.

Could you please advise how I can achieve the same performance shown in your benchmark?

Thank you very much for your assistance!

Best regards,
richer.chan

Hi,

The benchmark was run in super mode.
You can enable super mode on the Orin Nano with the commands below:

$ sudo nvpmodel -m 2
$ sudo jetson_clocks
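If it helps to confirm the mode took effect, the current settings can also be checked before running the benchmark (the exact mode label may differ between JetPack releases):

$ sudo nvpmodel -q            # query the active power mode (mode 2 is the super-mode profile)
$ sudo jetson_clocks --show   # show the current clock configuration
$ tegrastats                  # optionally watch CPU/GPU frequencies while the benchmark runs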

Thanks.

Hi AastaLLL,

Thanks for your reply.
I have enabled jetson_clocks, and my board has always had nvpmodel set to 2.
I then tested again, but the result is nearly the same as what I obtained before.

Attached the log:
bench_log.txt (47.9 KB)

Please let me know if there is anything you need from my side.

Thanks.

Hi,

Sorry for the late update.

Your hardware settings look correct to me (Orin Nano super mode).
To minimize the difference, could you try the script below instead?

Thanks.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.