BodyPoseNet results are significantly worse when switching from FP16 to INT8

yezhouyin · June 28, 2022, 6:43am

Please provide the following information when requesting support.

• Hardware (Jetson TX2 xavier NX JetPack 4.6)
• Network Type (BodyPoseNet)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

GitHub - NVIDIA-AI-IOT/deepstream_tao_apps: Sample apps to demonstrate how to deploy models trained with TAO on DeepStream
vim bodypose2d_pgie_config.txt
network-mode=1

With this setting the results are obviously worse.
Is this caused by an incorrect INT8 calibration, or is there some other reason?

Morganh · June 28, 2022, 7:23am

Please download the PNG file below and test with it as well. Thanks.
$wget https://developer-blogs.nvidia.com/wp-content/uploads/2021/06/original-image.png

yezhouyin · June 30, 2022, 9:45am

fp16:
[image: body2d2out_fp16.jpg]
int8:

Thank you for your answer.
As the pictures show, the INT8 result is obviously much worse.

Morganh · June 30, 2022, 10:22am

Please share all the config files. Thanks.

yezhouyin · June 30, 2022, 10:32am

bodypose2d_pgie_config.txt (3.0 KB)

…
network-mode=1
or
network-mode=2
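
For reference, `network-mode` in a DeepStream `nvinfer` config selects the inference precision: 0 is FP32, 1 is INT8, and 2 is FP16. Below is a minimal sketch for checking which precision a config requests and whether an INT8 calibration file is set. It assumes the file uses the usual `[property]` group; the helper name `describe_precision` and the sample text are illustrative only and are not taken from the attached config.

```python
import configparser

# network-mode values accepted by the DeepStream nvinfer plugin
PRECISION = {0: "FP32", 1: "INT8", 2: "FP16"}


def describe_precision(config_text: str) -> dict:
    """Report the requested precision and whether an INT8 calibration file is set."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    props = parser["property"]
    # nvinfer falls back to FP32 (0) when network-mode is absent
    mode = int(props.get("network-mode", "0"))
    return {
        "precision": PRECISION.get(mode, "unknown"),
        "has_calib_file": "int8-calib-file" in props,
    }


# Hypothetical config fragment, not the poster's actual file
sample = """
[property]
network-mode=1
int8-calib-file=int8_calibration_288_384.txt
"""
print(describe_precision(sample))
```

If `network-mode=1` is set but no calibration file is configured, that would be the first thing to rule out before comparing INT8 and FP16 output.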

Morganh · July 1, 2022, 9:35am

I cannot reproduce your result. In my test the INT8 result is the same as the FP32 model's. I am testing on a machine with a GeForce 1080 Ti.

You can comment out the line below and retry.
#model-engine-file=…/…/models/bodypose2d/model.etlt_b32_gpu0_fp16.engine
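
The reason this can matter: if `model-engine-file` points at an engine that already exists, `nvinfer` may load that engine as built, so an engine built for FP16 would still run as FP16 even with `network-mode=1`. A rough sketch of a sanity check follows. It assumes engine names follow the `_b<batch>_gpu<id>_<precision>.engine` pattern seen in the line above; the function name `engine_matches_mode` is hypothetical.

```python
import re

# network-mode value -> precision suffix used in generated engine file names
PRECISION = {0: "fp32", 1: "int8", 2: "fp16"}

# Assumed naming pattern, e.g. model.etlt_b32_gpu0_fp16.engine
ENGINE_NAME = re.compile(r"_b(\d+)_gpu(\d+)_(fp32|fp16|int8)\.engine$")


def engine_matches_mode(engine_path: str, network_mode: int) -> bool:
    """Return True if the precision in the engine file name matches network-mode."""
    match = ENGINE_NAME.search(engine_path)
    if match is None:
        raise ValueError(f"unrecognized engine file name: {engine_path}")
    return match.group(3) == PRECISION[network_mode]


# An FP16 engine combined with network-mode=1 (INT8) is a mismatch
print(engine_matches_mode("model.etlt_b32_gpu0_fp16.engine", 1))
```

Commenting out the `model-engine-file` line, as suggested, avoids the mismatch by letting the engine be rebuilt for the requested precision.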

BTW, the command I ran is:

./deepstream-bodypose2d-app 1 …/…/…/configs/bodypose2d_tao/sample_bodypose2d_model_config.txt file:///opt/nvidia/deepstream/deepstream-6.0/samples/configs/tao_pretrained_models/deepstream_tao_apps_6.0_ga/deepstream_tao_apps/apps/tao_others/deepstream-bodypose2d-app/original-image.png ./body2dout

Morganh · July 4, 2022, 8:16am

Could you try a dGPU machine?
Or, if possible, could you upgrade to JetPack 5.0?

yezhouyin · July 4, 2022, 9:30am

Thanks for your answer. The INT8 result is still just as bad.
I am testing on a machine with a GeForce 1080 Ti.

Morganh · July 4, 2022, 9:32am

Do you mean you still get a bad result on the GeForce 1080 Ti machine? Could you share the info below from that machine?
$ dpkg -l |grep cuda
$ ls -rltsh models/bodypose2d/

yezhouyin · July 6, 2022, 10:06am

NVIDIA GeForce RTX 3070
$ lspci | grep -i nvi
01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070 Ti] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)

Morganh · July 6, 2022, 4:16pm

How about
$ dpkg -l |grep cuda
$ ls -rltsh models/bodypose2d/

yezhouyin · July 14, 2022, 4:31am

@ai:~/src/deepstream_tao_apps$ dpkg -l |grep cuda
ii  cuda                              11.5.1-1     amd64  CUDA meta-package
ii  cuda-11-5                         11.5.2-1     amd64  CUDA 11.5 meta-package
ii  cuda-cccl-11-5                    11.5.62-1    amd64  CUDA CCCL
ii  cuda-command-line-tools-11-5      11.5.2-1     amd64  CUDA command-line tools
ii  cuda-compiler-11-5                11.5.2-1     amd64  CUDA compiler
ii  cuda-cudart-11-5                  11.5.117-1   amd64  CUDA Runtime native Libraries
ii  cuda-cudart-dev-11-5              11.5.117-1   amd64  CUDA Runtime native dev links, headers
ii  cuda-cuobjdump-11-5               11.5.119-1   amd64  CUDA cuobjdump
ii  cuda-cupti-11-5                   11.5.114-1   amd64  CUDA profiling tools runtime libs.
ii  cuda-cupti-dev-11-5               11.5.114-1   amd64  CUDA profiling tools interface.
ii  cuda-cuxxfilt-11-5                11.5.119-1   amd64  CUDA cuxxfilt
ii  cuda-demo-suite-11-5              11.5.50-1    amd64  Demo suite for CUDA
ii  cuda-documentation-11-5           11.5.114-1   amd64  CUDA documentation
ii  cuda-driver-dev-11-5              11.5.117-1   amd64  CUDA Driver native dev stub library
ii  cuda-drivers                      495.29.05-1  amd64  CUDA Driver meta-package, branch-agnostic
ii  cuda-drivers-495                  495.29.05-1  amd64  CUDA Driver meta-package, branch-specific
ii  cuda-gdb-11-5                     11.5.114-1   amd64  CUDA-GDB
ii  cuda-libraries-11-5               11.5.2-1     amd64  CUDA Libraries 11.5 meta-package
ii  cuda-libraries-dev-11-5           11.5.2-1     amd64  CUDA Libraries 11.5 development meta-package
ii  cuda-memcheck-11-5                11.5.114-1   amd64  CUDA-MEMCHECK
ii  cuda-nsight-11-5                  11.5.114-1   amd64  CUDA nsight
ii  cuda-nsight-compute-11-5          11.5.2-1     amd64  NVIDIA Nsight Compute
ii  cuda-nsight-systems-11-5          11.5.2-1     amd64  NVIDIA Nsight Systems
ii  cuda-nvcc-11-5                    11.5.119-1   amd64  CUDA nvcc
ii  cuda-nvdisasm-11-5                11.5.119-1   amd64  CUDA disassembler
ii  cuda-nvml-dev-11-5                11.5.50-1    amd64  NVML native dev links, headers
ii  cuda-nvprof-11-5                  11.5.114-1   amd64  CUDA Profiler tools
ii  cuda-nvprune-11-5                 11.5.119-1   amd64  CUDA nvprune
ii  cuda-nvrtc-11-5                   11.5.119-1   amd64  NVRTC native runtime libraries
ii  cuda-nvrtc-dev-11-5               11.5.119-1   amd64  NVRTC native dev links, headers
ii  cuda-nvtx-11-5                    11.5.114-1   amd64  NVIDIA Tools Extension
ii  cuda-nvvp-11-5                    11.5.126-1   amd64  CUDA Profiler tools
ii  cuda-runtime-11-5                 11.5.2-1     amd64  CUDA Runtime 11.5 meta-package
ii  cuda-samples-11-5                 11.5.56-1    amd64  CUDA example applications
ii  cuda-sanitizer-11-5               11.5.114-1   amd64  CUDA Sanitizer
rc  cuda-toolkit-11-4-config-common   11.4.148-1   all    Common config package for CUDA Toolkit 11.4.
ii  cuda-toolkit-11-5                 11.5.2-1     amd64  CUDA Toolkit 11.5 meta-package
ii  cuda-toolkit-11-5-config-common   11.5.117-1   all    Common config package for CUDA Toolkit 11.5.
ii  cuda-toolkit-11-config-common     11.6.55-1    all    Common config package for CUDA Toolkit 11.
ii  cuda-toolkit-config-common        11.6.55-1    all    Common config package for CUDA Toolkit.
ii  cuda-tools-11-5                   11.5.2-1     amd64  CUDA Tools meta-package
ii  cuda-visual-tools-11-5            11.5.2-1     amd64  CUDA visual tools
ii  nv-tensorrt-repo-ubuntu2004-cuda11.4-trt8.2.1.8-ga-20211117  1-1  amd64  nv-tensorrt repository configuration files

@ai:~/src/deepstream_tao_apps$ ls -rltsh models/bodypose2d/
total 65M
65M -rw-rw-r-- 1 laokc laokc 65M Jun 24 10:27 model.etlt
4.0K -rw-rw-r-- 1 laokc laokc 1.0K Jun 24 10:27 labels.txt
4.0K -rw-rw-r-- 1 laokc laokc 2.3K Jun 24 10:27 int8_calibration_320_448.txt
4.0K -rw-rw-r-- 1 laokc laokc 2.3K Jun 24 10:27 int8_calibration_288_384.txt
4.0K -rw-rw-r-- 1 laokc laokc 2.3K Jun 24 10:27 int8_calibration_224_320.txt

Morganh · July 18, 2022, 3:04am

Please update TensorRT to version 8.4 and retry.
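
As a rough illustration of the version gap, the sketch below pulls the TensorRT version out of the repo package name in the `dpkg` listing and compares it against 8.4. The helper name `trt_version_from_repo_name` is hypothetical, and the repo package name only indicates which local repository was installed; listing the installed `libnvinfer` packages would be a more direct check.

```python
import re


def trt_version_from_repo_name(package_name: str) -> tuple:
    """Extract the TensorRT version embedded in an nv-tensorrt-repo package name."""
    match = re.search(r"trt(\d+(?:\.\d+)*)", package_name)
    if match is None:
        raise ValueError(f"no TensorRT version found in: {package_name}")
    return tuple(int(part) for part in match.group(1).split("."))


# Repo package name as reported in the dpkg output above
repo = "nv-tensorrt-repo-ubuntu2004-cuda11.4-trt8.2.1.8-ga-20211117"
installed = trt_version_from_repo_name(repo)

# Tuple comparison is element-wise, so (8, 2, 1, 8) sorts below (8, 4)
print(installed, installed >= (8, 4))
```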

yezhouyin · July 27, 2022, 8:29am

ok

yingliu · August 2, 2022, 2:51am

Hello @yezhouyin, kindly let us know whether this topic can be closed.

yezhouyin · August 8, 2022, 2:34am

Thank you for your attention, it can be closed
