b) I was able to optimize the default YOLOv2 model for INT8 (with both the default and a custom calibration table). However, YOLOv3 still throws exceptions. If I try to use the default calibration table from the repo, I get the error from the first post:
Using cached calibration table to build the engine
trt-yolo-app: ../builder/cudnnBuilder2.cpp:1227: nvinfer1::cudnn::Engine* nvinfer1::builder::buildEngine(nvinfer1::CudaEngineBuildConfig&, const nvinfer1::cudnn::HardwareContext&, const nvinfer1::Network&): Assertion `it != tensorScales.end()' failed.
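For context, this is roughly the INT8 build path the app goes through (a simplified sketch against the TensorRT 4 C++ API; the builder, network and calibrator objects stand in for whatever the app actually constructs, this is not the trt-yolo-app source). The assertion above is raised from inside buildCudaEngine(), presumably when some tensor in the parsed network has no scale entry in the supplied table, which would be consistent with the cached yolov3-calibration.table not matching the network:

#include "NvInfer.h"

// Minimal sketch of the INT8 engine-build path (TensorRT 4 C++ API).
// "network" and "calibrator" stand in for the app's parsed YOLO network
// and its calibration-table reader.
nvinfer1::ICudaEngine* buildInt8Engine(nvinfer1::IBuilder* builder,
                                       nvinfer1::INetworkDefinition* network,
                                       nvinfer1::IInt8Calibrator* calibrator)
{
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 30);

    // In INT8 mode the builder first asks the calibrator for a cached table
    // (readCalibrationCache) and only runs calibration batches if that is null.
    builder->setInt8Mode(true);
    builder->setInt8Calibrator(calibrator);

    // The "Assertion `it != tensorScales.end()'" above comes from inside this
    // call, presumably when a network tensor has no scale in the table.
    return builder->buildCudaEngine(*network);
}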
I see others have a similar issue with other neural nets:
https://devtalk.nvidia.com/default/topic/1037060/tensorrt/trt-4-0-sampleuffssd-int8-calibration-failing/
https://devtalk.nvidia.com/default/topic/1015387/tensorrt-fails-to-build-fasterrcnn-gie-model-with-using-int8/
Maybe yolov3-calibration.table just isn’t compatible? I took the config and the weights from here: YOLO: Real-Time Object Detection
If I remove the default calibration table and try to build a custom one (by filling in calibration_images.txt; roughly the path sketched after the links below), it throws another exception:
New calibration table will be created to build the engine
trt-yolo-app: ../builder/cudnnBuilder2.cpp:685: virtual std::vector<nvinfer1::query::Ports<nvinfer1::query::TensorRequirements> > nvinfer1::builder::Node::getSupportedFormats(const nvinfer1::query::Ports<nvinfer1::query::AbstractTensor>&, const nvinfer1::cudnn::HardwareContext&, nvinfer1::builder::Format::Type, const nvinfer1::builder::FormatTypeHack&) const: Assertion `sf' failed.
It appears that others experience the same issue: https://devtalk.nvidia.com/default/topic/1043046/tensorrt/-tensorrt-for-yolo-v3-int8-optimization-failed-/
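For reference, the fresh-calibration path corresponds roughly to an IInt8EntropyCalibrator like the sketch below: readCalibrationCache() returns null, so TensorRT runs the images listed in calibration_images.txt through the network and writes a new table. Class and helper names (YoloCalibrator, loadBatchToHost) are illustrative placeholders, not the actual trt-yolo-app code:

// Rough sketch of an INT8 entropy calibrator for the fresh-calibration path
// (TensorRT 4 C++ API).
#include "NvInfer.h"
#include <cuda_runtime_api.h>
#include <fstream>
#include <string>
#include <vector>

class YoloCalibrator : public nvinfer1::IInt8EntropyCalibrator
{
public:
    YoloCalibrator(const std::string& imageList, int batchSize, size_t inputVolume)
        : mBatchSize(batchSize), mInputVolume(inputVolume)
    {
        std::ifstream list(imageList);            // calibration_images.txt
        for (std::string line; std::getline(list, line);)
            mImagePaths.push_back(line);          // must be absolute paths
        cudaMalloc(&mDeviceInput, mBatchSize * mInputVolume * sizeof(float));
    }
    ~YoloCalibrator() { cudaFree(mDeviceInput); }

    int getBatchSize() const override { return mBatchSize; }

    bool getBatch(void* bindings[], const char* /*names*/[], int /*nbBindings*/) override
    {
        if (mCursor + mBatchSize > static_cast<int>(mImagePaths.size()))
            return false;                         // no more calibration data
        std::vector<float> host(mBatchSize * mInputVolume);
        loadBatchToHost(mCursor, host);           // decode + preprocess images (stub)
        cudaMemcpy(mDeviceInput, host.data(), host.size() * sizeof(float),
                   cudaMemcpyHostToDevice);
        bindings[0] = mDeviceInput;               // single input binding assumed
        mCursor += mBatchSize;
        return true;
    }

    // Returning null here is what forces TensorRT to calibrate from scratch
    // instead of reusing yolov3-calibration.table.
    const void* readCalibrationCache(size_t& length) override { length = 0; return nullptr; }

    void writeCalibrationCache(const void* cache, size_t length) override
    {
        std::ofstream out("yolov3-calibration.table", std::ios::binary);
        out.write(static_cast<const char*>(cache), length);
    }

private:
    // Stub: the real app would preprocess mBatchSize images starting at "cursor"
    // exactly like its inference path and fill "host" with the pixel data.
    void loadBatchToHost(int /*cursor*/, std::vector<float>& host) { host.assign(host.size(), 0.f); }

    int mBatchSize{0};
    int mCursor{0};
    size_t mInputVolume{0};
    void* mDeviceInput{nullptr};
    std::vector<std::string> mImagePaths;
};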
There’s only one CUDA/TensorRT version installed:
deepstream-plugins$ dpkg -l | grep cuda
ii cuda-command-line-tools-9-0 9.0.176-1 amd64 CUDA command-line tools
ii cuda-core-9-0 9.0.176-1 amd64 CUDA core tools
ii cuda-cublas-9-0 9.0.176-1 amd64 CUBLAS native runtime libraries
ii cuda-cublas-dev-9-0 9.0.176-1 amd64 CUBLAS native dev links, headers
ii cuda-cudart-9-0 9.0.176-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-dev-9-0 9.0.176-1 amd64 CUDA Runtime native dev links, headers
ii cuda-cufft-9-0 9.0.176-1 amd64 CUFFT native runtime libraries
ii cuda-cufft-dev-9-0 9.0.176-1 amd64 CUFFT native dev links, headers
ii cuda-curand-9-0 9.0.176-1 amd64 CURAND native runtime libraries
ii cuda-curand-dev-9-0 9.0.176-1 amd64 CURAND native dev links, headers
ii cuda-cusolver-9-0 9.0.176-1 amd64 CUDA solver native runtime libraries
ii cuda-cusolver-dev-9-0 9.0.176-1 amd64 CUDA solver native dev links, headers
ii cuda-cusparse-9-0 9.0.176-1 amd64 CUSPARSE native runtime libraries
ii cuda-cusparse-dev-9-0 9.0.176-1 amd64 CUSPARSE native dev links, headers
ii cuda-demo-suite-9-0 9.0.176-1 amd64 Demo suite for CUDA
ii cuda-documentation-9-0 9.0.176-1 amd64 CUDA documentation
ii cuda-driver-dev-9-0 9.0.176-1 amd64 CUDA Driver native dev stub library
ii cuda-drivers 384.81-1 amd64 CUDA Driver meta-package
ii cuda-libraries-9-0 9.0.176-1 amd64 CUDA Libraries 9.0 meta-package
ii cuda-libraries-dev-9-0 9.0.176-1 amd64 CUDA Libraries 9.0 development meta-package
ii cuda-license-9-0 9.0.176-1 amd64 CUDA licenses
ii cuda-misc-headers-9-0 9.0.176-1 amd64 CUDA miscellaneous headers
ii cuda-npp-9-0 9.0.176-1 amd64 NPP native runtime libraries
ii cuda-npp-dev-9-0 9.0.176-1 amd64 NPP native dev links, headers
ii cuda-nvgraph-9-0 9.0.176-1 amd64 NVGRAPH native runtime libraries
ii cuda-nvgraph-dev-9-0 9.0.176-1 amd64 NVGRAPH native dev links, headers
ii cuda-nvml-dev-9-0 9.0.176-1 amd64 NVML native dev links, headers
ii cuda-nvrtc-9-0 9.0.176-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-dev-9-0 9.0.176-1 amd64 NVRTC native dev links, headers
ii cuda-repo-ubuntu1604-9-0-local 9.0.176-1 amd64 cuda repository configuration files
ii cuda-runtime-9-0 9.0.176-1 amd64 CUDA Runtime 9.0 meta-package
ii cuda-samples-9-0 9.0.176-1 amd64 CUDA example applications
rc cuda-toolkit-9-0 9.0.176-1 amd64 CUDA Toolkit 9.0 meta-package
rc cuda-visual-tools-9-0 9.0.176-1 amd64 CUDA visual tools
ii graphsurgeon-tf 4.1.2-1+cuda9.0 amd64 GraphSurgeon for TensorRT package
ii libcuda1-384 384.130-0ubuntu0.16.04.1 amd64 NVIDIA CUDA runtime library
ii libcudnn7 7.0.5.15-1+cuda9.0 amd64 cuDNN runtime libraries
ii libcudnn7-dev 7.0.5.15-1+cuda9.0 amd64 cuDNN development libraries and headers
ii libcudnn7-doc 7.0.5.15-1+cuda9.0 amd64 cuDNN documents and samples
ii libnvinfer-dev 4.1.2-1+cuda9.0 amd64 TensorRT development libraries and headers
ii libnvinfer-samples 4.1.2-1+cuda9.0 amd64 TensorRT samples and documentation
ii libnvinfer4 4.1.2-1+cuda9.0 amd64 TensorRT runtime libraries
ii nv-tensorrt-repo-ubuntu1604-cuda9.0-ga-trt4.0.1.6-20180612 1-1 amd64 nv-tensorrt repository configuration files
ii nvinfer-runtime-trt-repo-ubuntu1404-3.0.4-ga-cuda9.0 1.0-1 amd64 nvinfer-runtime-trt repository configuration files
ii python3-libnvinfer 4.1.2-1+cuda9.0 amd64 Python 3 bindings for TensorRT
ii python3-libnvinfer-dev 4.1.2-1+cuda9.0 amd64 Python 3 development package for TensorRT
ii python3-libnvinfer-doc 4.1.2-1+cuda9.0 amd64 Documention and samples of python bindings for TensorRT
ii tensorrt 4.0.1.6-1+cuda9.0 amd64 Meta package of TensorRT
ii uff-converter-tf 4.1.2-1+cuda9.0 amd64 UFF converter for TensorRT package
I packed the test project with all the files; maybe it will help to reproduce the issue: Dropbox - File Deleted
Here’s how it’s built and run:
deepstream-plugins$ cd sources/apps/trt-yolo/
deepstream-plugins/sources/apps/trt-yolo$ make clean && make
deepstream-plugins/sources/apps/trt-yolo$ cd ../../..
deepstream-plugins$ ./sources/apps/trt-yolo/trt-yolo-app
Please note that calibration_images.txt and test_images.txt need to be updated with absolute paths.