XavierNX performance benchmark

Hello,

I have a problem benchmarking the YOLO model on Xavier NX with the tool from this repository:

https://github.com/NVIDIA-AI-IOT/jetson_benchmarks

The test model was downloaded from the URL listed in the benchmark CSV: https://www.dropbox.com/s/ck9e40b57rd5o14/yolov3-tiny-416.zip

The table lists 546.69 FPS for the yolov3-tiny-416 model, which is great, but when I run the tool myself it reports 0 FPS with the error: Error opening engine file: ./models/yolov3-tiny-416_b8_ws2048_gpu.engine

So I converted the model from ONNX to an engine myself (the conversion command was extracted from the benchmark script):

/usr/src/tensorrt/bin/trtexec --onnx=yolov3-tiny-416-bs8.onnx --explicitBatch --inputIOFormats=int8:chw+chw4+chw32 --int8 --saveEngine=yolov3-tiny-416_b8_ws2048_gpu.engine

Then I ran the test with:

/usr/src/tensorrt/bin/trtexec --onnx=yolov3-tiny-416-bs8.onnx --explicitBatch --inputIOFormats=int8:chw+chw4+chw32 --int8 --workspace=2048 --avgRuns=10 --duration=180 --loadEngine=yolov3-tiny-416_b8_ws2048_gpu.engine --useSpinWait

Results

[02/07/2023-14:40:06] [I] === Performance summary ===
[02/07/2023-14:40:06] [I] Throughput: 73.9664 qps
[02/07/2023-14:40:06] [I] Latency: min = 11.0469 ms, max = 114.109 ms, mean = 13.9611 ms, median = 11.2422 ms, percentile(90%) = 11.5312 ms, percentile(95%) = 22.8438 ms, percentile(99%) = 86.875 ms
[02/07/2023-14:40:06] [I] Enqueue Time: min = 0.320312 ms, max = 108.086 ms, mean = 3.67974 ms, median = 0.476562 ms, percentile(90%) = 1.57812 ms, percentile(95%) = 18.7031 ms, percentile(99%) = 80.4922 ms
[02/07/2023-14:40:06] [I] H2D Latency: min = 0.226624 ms, max = 11 ms, mean = 0.456758 ms, median = 0.304688 ms, percentile(90%) = 0.421875 ms, percentile(95%) = 0.882812 ms, percentile(99%) = 4.23438 ms
[02/07/2023-14:40:06] [I] GPU Compute Time: min = 10.4297 ms, max = 108.539 ms, mean = 13.068 ms, median = 10.5859 ms, percentile(90%) = 10.75 ms, percentile(95%) = 19.6875 ms, percentile(99%) = 81.4336 ms
[02/07/2023-14:40:06] [I] D2H Latency: min = 0.265625 ms, max = 11.8672 ms, mean = 0.436255 ms, median = 0.34668 ms, percentile(90%) = 0.375 ms, percentile(95%) = 0.395508 ms, percentile(99%) = 3 ms
[02/07/2023-14:40:06] [I] Total Host Walltime: 180.041 s
[02/07/2023-14:40:06] [I] Total GPU Compute Time: 174.027 s
[02/07/2023-14:40:06] [W] * GPU compute time is unstable, with coefficient of variance = 89.5551%.

That works out to about 45~100 FPS, nowhere near 500+. Is something wrong?

P.S. Max power mode (20W 6CORE) was enabled, and jetson_clocks was enabled as well.
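
For anyone comparing runs, the headline numbers can be pulled out of a trtexec summary with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and it assumes the log lines have exactly the format shown in the results above.

```python
import re

# Two lines copied from the performance summary above.
SUMMARY = """\
[02/07/2023-14:40:06] [I] Throughput: 73.9664 qps
[02/07/2023-14:40:06] [I] Latency: min = 11.0469 ms, max = 114.109 ms, mean = 13.9611 ms, median = 11.2422 ms, percentile(90%) = 11.5312 ms, percentile(95%) = 22.8438 ms, percentile(99%) = 86.875 ms
"""

def parse_throughput(log_text):
    """Return the 'Throughput: N qps' value as a float, or None if absent."""
    match = re.search(r"Throughput:\s*([0-9.]+)\s*qps", log_text)
    return float(match.group(1)) if match else None

def parse_latency(log_text):
    """Return the end-to-end 'Latency:' statistics as a dict of milliseconds."""
    # '] Latency:' skips the 'H2D Latency:' and 'D2H Latency:' lines.
    line = re.search(r"\] Latency: (.*)", log_text)
    if line is None:
        return {}
    pairs = re.findall(r"([A-Za-z0-9()%]+) = ([0-9.]+) ms", line.group(1))
    return {name: float(value) for name, value in pairs}

throughput = parse_throughput(SUMMARY)
latency = parse_latency(SUMMARY)
print("queries per second:", throughput)
print("median latency (ms):", latency["median"])
```

Note that both values are per query (per enqueued batch), not per image.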

Hi,

The yolov3-tiny-416-bs8.onnx model has a batch size of 8.
So each iteration produces 8 outputs concurrently, and the FPS is the reported throughput (qps) multiplied by 8.
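
As a quick check of that arithmetic with the throughput from the log above (a sketch only; it assumes each trtexec query processes one full batch of 8 images, and the helper name is made up for illustration):

```python
BATCH_SIZE = 8            # from the model name: yolov3-tiny-416-bs8.onnx
THROUGHPUT_QPS = 73.9664  # "Throughput" line of the trtexec summary

def images_per_second(qps, batch_size):
    """Convert queries (batches) per second into images per second."""
    return qps * batch_size

fps = images_per_second(THROUGHPUT_QPS, BATCH_SIZE)
print(f"{fps:.2f} images/s")  # prints "591.73 images/s"
```

That is in the same range as the 546.69 FPS listed in the table.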

Regarding your issue, would you mind running the benchmark with the below steps (benchmark.py)?
The script should be able to reproduce the same results as the table.

Thanks.

The application crashes if I simply repeat those steps:

root@ubuntu:~# git clone https://github.com/NVIDIA-AI-IOT/jetson_benchmarks.git
Cloning into 'jetson_benchmarks'...
remote: Enumerating objects: 90, done.
remote: Counting objects: 100% (52/52), done.
remote: Compressing objects: 100% (37/37), done.
remote: Total 90 (delta 28), reused 27 (delta 15), pack-reused 38
Unpacking objects: 100% (90/90), 30.59 KiB | 310.00 KiB/s, done.
root@ubuntu:~# cd jetson_benchmarks/
root@ubuntu:~/jetson_benchmarks# ls
LICENSE.md  README.md  benchmark.py  benchmark_csv  install_requirements.sh  utils
root@ubuntu:~/jetson_benchmarks# python3 utils/download_models.py --all --csv_file_path benchmark_csv/nx-benchmarks.csv --save_dir ./models
inception_v4.prototxt                                 100%[=======================================================================================================================>] 118.14K  --.-KB/s    in 0.04s
vgg19_N2.prototxt                                     100%[=======================================================================================================================>]   5.80K  --.-KB/s    in 0s
super_resolution_bsd500.zip                           100%[=======================================================================================================================>] 990.22K  6.39MB/s    in 0.2s
unet-segmentation.uff                                 100%[=======================================================================================================================>]   1.88M  7.44MB/s    in 0.3s
pose_estimation.prototxt                              100%[=======================================================================================================================>]  44.90K  --.-KB/s    in 0.02s
yolov3-tiny-416.zip                                   100%[=======================================================================================================================>] 156.92M  7.15MB/s    in 21s
ResNet50_224x224.prototxt                             100%[=======================================================================================================================>]  34.20K  --.-KB/s    in 0.02s
ssd-mobilenet-v1.zip                                  100%[=======================================================================================================================>] 171.01M  7.20MB/s    in 25s
root@ubuntu:~/jetson_benchmarks# python3 benchmark.py --all --csv_file_path benchmark_csv/nx-benchmarks.csv --model_dir ./models
Please close all other applications and Press Enter to continue...
Setting Jetson orin in max performance mode
Traceback (most recent call last):
  File "benchmark.py", line 130, in <module>
    main()
  File "benchmark.py", line 28, in main
    system_check.run_set_clocks_withDVFS()
  File "/root/jetson_benchmarks/utils/utilities.py", line 45, in run_set_clocks_withDVFS
    self.set_clocks_withDVFS(frequency=self.gpu_freq, device='gpu')
  File "/root/jetson_benchmarks/utils/utilities.py", line 74, in set_clocks_withDVFS
    from_freq = self.read_internal_register(register=freq_register_, device=device)
  File "/root/jetson_benchmarks/utils/utilities.py", line 100, in read_internal_register
    reg_read = open(register, "r")
FileNotFoundError: [Errno 2] No such file or directory: '/sys/kernel/debug/bpmp/debug/clk/nafll_gpc0/rate'
root@ubuntu:~/jetson_benchmarks#

Does it think the device is an Orin?

root@ubuntu:~# lshw
ubuntu
    description: Computer
    product: NVIDIA Jetson Xavier NX Developer Kit
    vendor: Unknown
    version: Not Specified
    serial: 1421122100753
    width: 64 bits
    capabilities: smbios-3.0.0 dmi-3.0.0 smp cp15_barrier setend swp tagged_addr_disabled
    configuration: boot=normal family=Unknown sku=Unknown
  *-core
       description: Motherboard
       product: NVIDIA Jetson Xavier NX Developer Kit
       vendor: Unknown
       physical id: 0
       version: Not Specified
       serial: 1421122100753
       slot: Unknown
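
To check this without running the whole benchmark, something like the following could be used. It is a rough sketch: the clock path is the one from the traceback above, while /proc/device-tree/model is an assumption about where the board name is exposed, and the helper names are made up for illustration. Reading debugfs normally requires root.

```python
from pathlib import Path

# Path taken verbatim from the FileNotFoundError in the traceback.
GPU_RATE_NODE = "/sys/kernel/debug/bpmp/debug/clk/nafll_gpc0/rate"
# Assumption: the board name is exposed here on this system.
MODEL_NODE = "/proc/device-tree/model"

def read_text_or_none(path):
    """Return the stripped contents of a file, or None if it cannot be read."""
    try:
        return Path(path).read_text(errors="replace").strip("\x00\n ")
    except OSError:  # covers missing files and permission errors
        return None

def report(model_node=MODEL_NODE, rate_node=GPU_RATE_NODE):
    """Collect the board name and the GPU clock rate node contents."""
    return {
        "model": read_text_or_none(model_node),
        "gpu_rate": read_text_or_none(rate_node),
    }

if __name__ == "__main__":
    for key, value in report().items():
        print(f"{key}: {value if value is not None else 'not readable'}")
```

If gpu_rate comes back as not readable while the model reports a Xavier NX, that matches the crash: the script is reading a clock node that does not exist on this board.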

Hi,

Could you add the --jetson_clocks flag to the benchmark.py command to see if it works?

Thanks.
