How to verify that the trtexec --int8 option has taken effect

Hi all, I want to know whether the --int8 option I specified has actually taken effect.

I ran the following command:
./trtexec --onnx=…/data/resnet50/ResNet50.onnx --int8

The log contains the following lines:

[11/15/2023-00:53:37] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[11/15/2023-00:53:37] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 471 MiB, GPU 4156 MiB
[11/15/2023-00:53:37] [W] [TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
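
For a long log, a small filter can help pick out the lines that mention precision or calibration. The sketch below is only an illustration: it assumes log lines follow the "[timestamp] [LEVEL] message" layout seen in the excerpt above, and the function name and keyword list are hypothetical choices, not part of trtexec.

```python
import re

# Assumption: each log line looks like "[date-time] [LEVEL] message",
# where LEVEL is a single letter such as I, W, E or V (as in the excerpt above).
LOG_LINE = re.compile(r"^\[(?P<stamp>[^\]]+)\]\s+\[(?P<level>[A-Z])\]\s+(?P<msg>.*)$")


def find_precision_lines(log_text, keywords=("INT8", "Calibrator", "dynamic range")):
    """Return (level, message) pairs whose message mentions any keyword (case-insensitive)."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if any(k.lower() in msg.lower() for k in keywords):
            hits.append((match.group("level"), msg))
    return hits


# The three lines quoted above, used as sample input.
sample_log = """\
[11/15/2023-00:53:37] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[11/15/2023-00:53:37] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 471 MiB, GPU 4156 MiB
[11/15/2023-00:53:37] [W] [TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
"""

for level, msg in find_precision_lines(sample_log):
    print(level, msg)
```

On the sample input this keeps the informational INT8 line and the calibrator warning and drops the memory snapshot line.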

Does this mean that INT8 precision is not being used and INT32 was used instead? Also, what does "Calibrator is not being used" mean? Is running without a calibrator the right approach? If not, what should be done to make trtexec use a calibrator?

The captured log file is attached. Please go through it and confirm whether the --int8 option has actually taken effect.

Thanks and Regards

Nagaraj Trivedi
trtexec_onnx_resnet_50_int8.txt (18.6 KB)

Dear @trivedi.nagaraj,
The command you used reports performance statistics with INT8 precision enabled. Do you see better timings with INT8 than with FP32? Please run with the --verbose option for both FP32 and INT8 and compare.

Yes, performance improved with the --int8 option.
With --int8 the throughput is 58.7143 qps, compared to 7.76857 qps with the --fp32 option, roughly a 7.6x improvement.
Below are the performance summaries for --int8 and --fp32.
Performance summary with --int8

[11/15/2023-10:41:12] [I] === Performance summary ===
[11/15/2023-10:41:12] [I] Throughput: 58.7143 qps
[11/15/2023-10:41:12] [I] Latency: min = 16.917 ms, max = 17.2163 ms, mean = 17.0075 ms, median = 17.0064 ms, percentile(99%) = 17.1074 ms
[11/15/2023-10:41:12] [I] End-to-End Host Latency: min = 16.9333 ms, max = 17.2334 ms, mean = 17.026 ms, median = 17.0251 ms, percentile(99%) = 17.1235 ms
[11/15/2023-10:41:12] [I] Enqueue Time: min = 12.573 ms, max = 17.5194 ms, mean = 13.8852 ms, median = 13.8246 ms, percentile(99%) = 16.6606 ms
[11/15/2023-10:41:12] [I] H2D Latency: min = 0.207031 ms, max = 0.210693 ms, mean = 0.20807 ms, median = 0.207825 ms, percentile(99%) = 0.2099 ms
[11/15/2023-10:41:12] [I] GPU Compute Time: min = 16.6964 ms, max = 16.9951 ms, mean = 16.7863 ms, median = 16.7853 ms, percentile(99%) = 16.8868 ms
[11/15/2023-10:41:12] [I] D2H Latency: min = 0.0129395 ms, max = 0.0134277 ms, mean = 0.0131503 ms, median = 0.0131836 ms, percentile(99%) = 0.0134277 ms
[11/15/2023-10:41:12] [I] Total Host Walltime: 3.03163 s
[11/15/2023-10:41:12] [I] Total GPU Compute Time: 2.98795 s
[11/15/2023-10:41:12] [I] Explanations of the performance metrics are printed in the verbose logs.
[11/15/2023-10:41:12] [V]

Performance summary with --fp32

[11/15/2023-10:16:59] [I] === Performance summary ===
[11/15/2023-10:16:59] [I] Throughput: 7.76857 qps
[11/15/2023-10:16:59] [I] Latency: min = 128.443 ms, max = 128.93 ms, mean = 128.701 ms, median = 128.683 ms, percentile(99%) = 128.93 ms
[11/15/2023-10:16:59] [I] End-to-End Host Latency: min = 128.458 ms, max = 128.946 ms, mean = 128.718 ms, median = 128.7 ms, percentile(99%) = 128.946 ms
[11/15/2023-10:16:59] [I] Enqueue Time: min = 16.1404 ms, max = 20.4997 ms, mean = 17.1669 ms, median = 17.0371 ms, percentile(99%) = 20.4997 ms
[11/15/2023-10:16:59] [I] H2D Latency: min = 0.207458 ms, max = 0.211426 ms, mean = 0.208414 ms, median = 0.208008 ms, percentile(99%) = 0.211426 ms
[11/15/2023-10:16:59] [I] GPU Compute Time: min = 128.221 ms, max = 128.708 ms, mean = 128.479 ms, median = 128.46 ms, percentile(99%) = 128.708 ms
[11/15/2023-10:16:59] [I] D2H Latency: min = 0.013916 ms, max = 0.0141602 ms, mean = 0.0139905 ms, median = 0.013916 ms, percentile(99%) = 0.0141602 ms
[11/15/2023-10:16:59] [I] Total Host Walltime: 3.2181 s
[11/15/2023-10:16:59] [I] Total GPU Compute Time: 3.21197 s
[11/15/2023-10:16:59] [I] Explanations of the performance metrics are printed in the verbose logs.
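
The comparison between the two runs can also be computed directly from the logs. The following is a minimal sketch, assuming the "Throughput: ... qps" line format shown in the summaries above; the helper name `throughput_qps` is a hypothetical choice for illustration.

```python
import re

# Matches the "Throughput: <number> qps" line of a trtexec performance summary.
THROUGHPUT = re.compile(r"Throughput:\s*([0-9]*\.?[0-9]+)\s*qps")


def throughput_qps(log_text):
    """Return the first throughput value (in qps) found in the log text."""
    match = THROUGHPUT.search(log_text)
    if match is None:
        raise ValueError("no 'Throughput: ... qps' line found")
    return float(match.group(1))


# Throughput lines copied from the two performance summaries above.
int8_log = "[11/15/2023-10:41:12] [I] Throughput: 58.7143 qps"
fp32_log = "[11/15/2023-10:16:59] [I] Throughput: 7.76857 qps"

speedup = throughput_qps(int8_log) / throughput_qps(fp32_log)
print(f"INT8 speedup over FP32: {speedup:.2f}x")
```

In practice the two strings would be replaced by the full contents of each log file, for example read with `open(path).read()`.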

That indicates INT8 is being used.
