ERROR: [TRT]: stdArchiveReader ... Serialization assertion

Host computer - Ubuntu, (2) RTX 2080 TIs;
Driver Version: 515.65.01 CUDA Version: 11.7

TAO environment - docker based:
FROM nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3
cv_samples_vv1.4.1
I trained DetectNet V2 ResNet-18 using the Jupyter notebook in cv_samples_vv1.4.1.
Training works perfectly: it passes all tests, and inference validation works as expected.

Deepstream 6.1 Environment - docker based
FROM nvcr.io/nvidia/deepstream:6.1-devel
I am using deepstream-transfer-learning-app

a) I built the sample app (by default it uses the resnet10 Caffe model); it works perfectly.
b) Using a config (see below) to utilize my newly trained DetectNet model, I get an error:

ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 213, Serialized Engine Version: 205)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)

I also have a computer with an RTX 3060 TI.
Same experiment, train the model then use it with deepstream-transfer-learning-app
I get a different result. I get a warning but it works.

WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:02.472646988 68 0x565156e7d490 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8

I’m using the same computer for training (TAO) and DeepStream in both cases: computer A has (2) RTX 2080 TIs, computer B has an RTX 3060 TI. I did not move models between the two computers.

here is my infer config:

cat /project/deepstream/prod_model/dn2_rs18/config_infer_primary.yml

property:
gpu-id: 0
net-scale-factor: 0.00392156862745098
int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration_qat.bin
model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
labelfile-path: /project/deepstream/prod_model/dn2_rs18/labels.txt
batch-size: 24
process-mode: 1
offsets: 0.0;0.0;0.0
infer-dims: 3;384;1248
tlt-model-key:
network-type: 0
num-detected-classes: 3
uff-input-order: 0
output-blob-names: output_cov/Sigmoid;output_bbox/BiasAdd
uff-input-blob-name: input_1
model-color-format: 0
maintain-aspect-ratio: 0

This kind of error is usually caused by a TensorRT version mismatch between the environment that built the engine and the environment that loads it.
You can comment out
model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8

and let DeepStream generate the engine in the current environment.
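
To confirm the mismatch from the log itself, the two serialization version tags can be pulled out of the TensorRT error line. This is a minimal Python sketch, assuming the message keeps the "Current Version: N, Serialized Engine Version: M" wording shown in the original post. The function name parse_engine_version_mismatch is hypothetical and is not part of TensorRT or DeepStream.

```python
import re

# The pattern assumes the wording seen in the log above; other TensorRT
# releases may phrase the message differently.
_VERSION_RE = re.compile(
    r"Current Version:\s*(\d+),\s*Serialized Engine Version:\s*(\d+)"
)


def parse_engine_version_mismatch(log_line):
    """Return (runtime_tag, engine_tag) if the line reports both, else None."""
    match = _VERSION_RE.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


line = (
    "ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: "
    "Serialization (Serialization assertion stdVersionRead == serializationVersion "
    "failed.Version tag does not match. Note: Current Version: 213, "
    "Serialized Engine Version: 205)"
)
tags = parse_engine_version_mismatch(line)
if tags is not None and tags[0] != tags[1]:
    print(f"runtime tag {tags[0]} != engine tag {tags[1]}: rebuild the engine here")
```

When the two tags differ, the engine was serialized by a different TensorRT build than the one trying to load it, so it has to be rebuilt in the loading environment.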

Thanks for your help. I tried your suggestion. It didn’t work on my computer with (2) RTX 2080 TIs. I tried some combinations and here are the results.

This is the config_infer_primary.yml:

property:
gpu-id: 0
net-scale-factor: 0.00392156862745098
#1 int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration_qat.bin
#2 model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
#3 model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt
#4 model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt
labelfile-path: /project/deepstream/prod_model/dn2_rs18/labels.txt
batch-size: 24
process-mode: 1
offsets: 0.0;0.0;0.0
infer-dims: 3;384;1248
tlt-model-key: (mykey)
network-type: 0
num-detected-classes: 3
uff-input-order: 0
output-blob-names: output_cov/Sigmoid;output_bbox/BiasAdd
uff-input-blob-name: input_1
model-color-format: 0
maintain-aspect-ratio: 0

So I tried with each of the 4 respective bin/engine files. I uncommented one of the 4 lines at a time.

case 1 – this would be equivalent to your suggestion of commenting out my *detector_qat.trt.int8
int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration_qat.bin

0:00:00.165909648 1272 0x55c770a85b60 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:860 failed to build network since there is no model file matched.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:799 failed to build network.
0:00:00.895838689 1272 0x55c770a85b60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
0:00:00.917822891 1272 0x55c770a85b60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2029> [UID = 1]: build backend context failed
0:00:00.917926592 1272 0x55c770a85b60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1266> [UID = 1]: generate backend failed, check config file settings
0:00:00.917948713 1272 0x55c770a85b60 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:00.917954083 1272 0x55c770a85b60 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Config file path: /project/deepstream/prod_model/dn2_rs18/config_infer_primary.yml, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

case 2:
model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8

ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::40] Error Code 1: Serialization (Serialization assertion stdVersionRead == serializationVersion failed.Version tag does not match. Note: Current Version: 213, Serialized Engine Version: 205)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:1528 Deserialize engine failed from file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
0:00:00.908849557 1316 0x55e981636b60 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8 failed
0:00:00.929464099 1316 0x55e981636b60 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8 failed, try rebuild
0:00:00.929479429 1316 0x55e981636b60 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:860 failed to build network since there is no model file matched.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:799 failed to build network.
0:00:01.435991985 1316 0x55e981636b60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
0:00:01.456876270 1316 0x55e981636b60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2029> [UID = 1]: build backend context failed
0:00:01.456897270 1316 0x55e981636b60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1266> [UID = 1]: generate backend failed, check config file settings
0:00:01.456912820 1316 0x55e981636b60 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:01.456916560 1316 0x55e981636b60 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Config file path: /project/deepstream/prod_model/dn2_rs18/config_infer_primary.yml, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

case 3:
model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt

ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::30] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:1528 Deserialize engine failed from file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt
0:00:00.881359827 1357 0x55d48ed1dd60 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt failed
0:00:00.902421478 1357 0x55d48ed1dd60 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt failed, try rebuild
0:00:00.902436368 1357 0x55d48ed1dd60 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:860 failed to build network since there is no model file matched.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:799 failed to build network.
0:00:01.409074656 1357 0x55d48ed1dd60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
0:00:01.430537970 1357 0x55d48ed1dd60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2029> [UID = 1]: build backend context failed
0:00:01.430557570 1357 0x55d48ed1dd60 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1266> [UID = 1]: generate backend failed, check config file settings
0:00:01.430575600 1357 0x55d48ed1dd60 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:01.430579741 1357 0x55d48ed1dd60 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Config file path: /project/deepstream/prod_model/dn2_rs18/config_infer_primary.yml, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

case 4:
model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt

ERROR: [TRT]: 1: [stdArchiveReader.cpp::StdArchiveReader::30] Error Code 1: Serialization (Serialization assertion magicTagRead == kMAGIC_TAG failed.Magic tag does not match)
ERROR: [TRT]: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:1528 Deserialize engine failed from file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt
0:00:00.920260287 1394 0x5651c4fe9960 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt failed
0:00:00.941272524 1394 0x5651c4fe9960 WARN nvinfer gstnvinfer.cpp:643:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt failed, try rebuild
0:00:00.941287905 1394 0x5651c4fe9960 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:860 failed to build network since there is no model file matched.
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:799 failed to build network.
0:00:01.444696177 1394 0x5651c4fe9960 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
0:00:01.465987467 1394 0x5651c4fe9960 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2029> [UID = 1]: build backend context failed
0:00:01.466005447 1394 0x5651c4fe9960 ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1266> [UID = 1]: generate backend failed, check config file settings
0:00:01.466022137 1394 0x5651c4fe9960 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Failed to create NvDsInferContext instance
0:00:01.466025747 1394 0x5651c4fe9960 WARN nvinfer gstnvinfer.cpp:846:gst_nvinfer_start:<primary_gie> error: Config file path: /project/deepstream/prod_model/dn2_rs18/config_infer_primary.yml, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

On my computer with the RTX 3060 TI (same versions and process as the 2080 TI computer),

here is my config_infer_primary.yml

property:
gpu-id: 0
net-scale-factor: 0.00392156862745098
int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration_qat.bin
model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
labelfile-path: /project/deepstream/prod_model/dn2_rs18/labels.txt
batch-size: 24
process-mode: 1
offsets: 0.0;0.0;0.0
infer-dims: 3;384;1248
tlt-model-key: (my key)
network-type: 0
num-detected-classes: 3
uff-input-order: 0
output-blob-names: output_cov/Sigmoid;output_bbox/BiasAdd
uff-input-blob-name: input_1
model-color-format: 0
maintain-aspect-ratio: 0

it runs fine - but throws a warning
WARNING: [TRT]: Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
0:00:02.532990356 333 0x562549051090 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1900> [UID = 1]: deserialized trt engine from :/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:610 [Implicit Engine Info]: layers num: 3

I am assuming that the version mismatch is in the TRT engine version, and that the engine was generated by cv_samples_vv1.4.1 detectnet_v2, Step 10.B in the provided Jupyter notebook. Is this correct? (I am running all steps, TAO and DeepStream, on the same machine.)

If I run $ tao converter, which puts me in a Docker container, and then execute:

dpkg -l | grep TensorRT
ii libnvinfer-bin 8.2.5-1+cuda11.4 amd64 TensorRT binaries
ii libnvinfer-dev 8.2.5-1+cuda11.4 amd64 TensorRT development libraries and headers
ii libnvinfer-plugin-dev 8.2.5-1+cuda11.4 amd64 TensorRT plugin libraries
ii libnvinfer-plugin8 8.2.5-1+cuda11.4 amd64 TensorRT plugin libraries
ii libnvinfer8 8.2.5-1+cuda11.4 amd64 TensorRT runtime libraries
ii libnvonnxparsers-dev 8.2.5-1+cuda11.4 amd64 TensorRT ONNX libraries
ii libnvonnxparsers8 8.2.5-1+cuda11.4 amd64 TensorRT ONNX libraries
ii libnvparsers-dev 8.2.5-1+cuda11.4 amd64 TensorRT parsers libraries
ii libnvparsers8 8.2.5-1+cuda11.4 amd64 TensorRT parsers libraries
ii python3-libnvinfer 8.2.5-1+cuda11.4 amd64 Python 3 bindings for TensorRT
ii python3-libnvinfer-dev 8.2.5-1+cuda11.4 amd64 Python 3 development package for TensorRT

Then if I go back to a deepstream-6.1.1-devel container and do the same thing:

dpkg -l | grep TensorRT
ii graphsurgeon-tf 8.4.1-1+cuda11.6 amd64 GraphSurgeon for TensorRT package
ii libnvinfer-bin 8.4.1-1+cuda11.6 amd64 TensorRT binaries
ii libnvinfer-dev 8.4.1-1+cuda11.6 amd64 TensorRT development libraries and headers
ii libnvinfer-plugin-dev 8.4.1-1+cuda11.6 amd64 TensorRT plugin libraries
ii libnvinfer-plugin8 8.4.1-1+cuda11.6 amd64 TensorRT plugin libraries
ii libnvinfer-samples 8.4.1-1+cuda11.6 all TensorRT samples
ii libnvinfer8 8.4.1-1+cuda11.6 amd64 TensorRT runtime libraries
ii libnvonnxparsers-dev 8.4.1-1+cuda11.6 amd64 TensorRT ONNX libraries
ii libnvonnxparsers8 8.4.1-1+cuda11.6 amd64 TensorRT ONNX libraries
ii libnvparsers-dev 8.4.1-1+cuda11.6 amd64 TensorRT parsers libraries
ii libnvparsers8 8.4.1-1+cuda11.6 amd64 TensorRT parsers libraries
ii python3-libnvinfer 8.4.1-1+cuda11.6 amd64 Python 3 bindings for TensorRT
ii uff-converter-tf 8.4.1-1+cuda11.6 amd64 UFF converter for TensorRT package

I definitely see a mis-match in versions between the TAO environment and the Deepstream environment.
I’m using TAO dockers:
FROM nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3

and I’m using Deepstream dockers:
FROM nvcr.io/nvidia/deepstream:6.1.1-devel

I thought these are the current versions. Am I wrong? Should I be using a different version?
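
The comparison of the two dpkg listings above can be sketched in Python. This assumes the output of dpkg -l | grep TensorRT from each container has been captured as text; tensorrt_versions is a hypothetical helper name, and the two sample rows are copied from the listings in this post.

```python
def tensorrt_versions(dpkg_output):
    """Map package name -> version for installed ('ii') rows of `dpkg -l` output."""
    versions = {}
    for raw in dpkg_output.splitlines():
        fields = raw.split()
        # `dpkg -l` rows: status, package name, version, architecture, description...
        if len(fields) >= 3 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


tao = tensorrt_versions(
    "ii libnvinfer8 8.2.5-1+cuda11.4 amd64 TensorRT runtime libraries"
)
deepstream = tensorrt_versions(
    "ii libnvinfer8 8.4.1-1+cuda11.6 amd64 TensorRT runtime libraries"
)

# Report every package present in both containers whose version differs.
for package in sorted(tao.keys() & deepstream.keys()):
    if tao[package] != deepstream[package]:
        print(f"{package}: TAO {tao[package]} vs DeepStream {deepstream[package]}")
```

Any difference in libnvinfer8 means an engine serialized in one container cannot be assumed to load in the other.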

just to add to my confusion… in the deepstream 6.1.1-devel container:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0

I see I’m on CUDA 11.7 (which doesn’t match the CUDA 11.6 that my TRT packages were built for)

With your hints, I re-read the nvinfer documentation a couple more times :) Here is a combination that works.

Here is my config:

abbreviated preprocess config file
I see preprocess MUST be FP32 regardless of infer network mode
#- 0=FP32, 1=UINT8, 2=INT8, 3=UINT32, 4=INT32, 5=FP16
tensor-data-type=0

nvinfer config file
property:
gpu-id: 0
net-scale-factor: 0.00392156862745098
int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration_qat.bin
#-works-warnings int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration.tensor
#-works-warnings int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration.bin
#2 model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
#3 model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt
#4 model-engine-file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt
#- tlt-encoded-model: /project/deepstream/prod_model/dn2_rs18/resnet18_detector.trt.int8
#- tlt-encoded-model: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.trt.int8
tlt-encoded-model: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt
#- tlt-encoded-model: /project/deepstream/prod_model/dn2_rs18/resnet18_detector.etlt
labelfile-path: /project/deepstream/prod_model/dn2_rs18/labels.txt
network-mode: 1
batch-size: 24
process-mode: 1
offsets: 0.0;0.0;0.0
infer-dims: 3;384;1248
tlt-model-key: (my key)
network-type: 0
num-detected-classes: 3
uff-input-order: 0
output-blob-names: output_cov/Sigmoid;output_bbox/BiasAdd
uff-input-blob-name: input_1
model-color-format: 0
maintain-aspect-ratio: 0
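
The difference between the failing cases and this working config comes down to which key each file is given under. Below is a minimal Python sketch of that check, assuming the property section has already been loaded into a plain dict. check_model_keys is a hypothetical helper, and its rules encode only what this thread shows: an .etlt under model-engine-file fails to deserialize, no model file at all fails to build, and an .etlt under tlt-encoded-model together with a tlt-model-key builds.

```python
def check_model_keys(prop):
    """Return a list of problems found in the `property` section of an nvinfer config."""
    problems = []
    engine = prop.get("model-engine-file") or ""
    encoded = prop.get("tlt-encoded-model") or ""
    if engine.endswith(".etlt"):
        problems.append(
            "model-engine-file points at an .etlt; use tlt-encoded-model for it"
        )
    if encoded and not prop.get("tlt-model-key"):
        problems.append("tlt-encoded-model is set but tlt-model-key is empty")
    if not engine and not encoded:
        problems.append("no model file given, so there is nothing to build from")
    return problems


case_3 = {
    "model-engine-file": "/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt",
    "tlt-model-key": "(mykey)",
}
working = {
    "tlt-encoded-model": "/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt",
    "tlt-model-key": "(my key)",
}
print(check_model_keys(case_3))
print(check_model_keys(working))
```

Other model sources that nvinfer accepts are deliberately left out; the sketch covers only the combinations tried above.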

This generates a QAT INT8 engine, but with a bunch of these warnings (missing scale and zero-point):

WARNING: [TRT]: Missing scale and zero-point for tensor output_cov/convolution, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor output_cov/bias, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor output_cov/BiasAdd, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor output_cov/Sigmoid, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
0:00:58.796175281 14428 0x55e54a83ff60 INFO nvinfer gstnvinfer.cpp:646:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1955> [UID = 1]: serialize cuda engine to file: /project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt_b24_gpu0_int8.engine successfully
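
The engine file name in the last log line above appears to be built from the model path, batch size, GPU id, and precision. Here is a small Python sketch of that pattern, assuming the naming inferred from this single log line holds in general; expected_engine_path is a hypothetical helper, not a DeepStream API.

```python
def expected_engine_path(model_path, batch_size, gpu_id, precision):
    """Build the engine path following the pattern seen in the log line above."""
    return f"{model_path}_b{batch_size}_gpu{gpu_id}_{precision}.engine"


path = expected_engine_path(
    "/project/deepstream/prod_model/dn2_rs18/resnet18_detector_qat.etlt",
    batch_size=24,
    gpu_id=0,
    precision="int8",
)
print(path)
```

If the pattern holds, setting model-engine-file to this path on later runs, next to tlt-encoded-model, should let DeepStream load the serialized engine instead of rebuilding it each time.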

Looks like I get the same warnings (and the same quantity) whether I use

  • calibration_qat.bin
  • calibration.tensor

It doesn’t seem to matter.

If I use NO calibration file, I get:
WARNING: [TRT]: - Subnormal FP16 values detected.
WARNING: [TRT]: - Values less than smallest positive FP16 Subnormal value detected. Converting to FP16 minimum subnormalized value.
WARNING: [TRT]: If this is not the desired behavior, please modify the weights or retrain with regularization to reduce the magnitude of the weights.
WARNING: [TRT]: Weights [name=block_1a_conv_2/convolution + block_1a_conv_2/BiasAdd + block_1a_bn_2/batchnorm/mul_1 + block_1a_bn_2/batchnorm/add_1 + add_1/add + PWN(block_1a_relu/Relu6).weight] had the following issues when converted to FP16:

Still generates an INT8 model (but 50% bigger) and it runs.

So my final question is, how do I reduce these scale and zero-point warnings? This can’t be right. When I test the model in TAO, I get no errors/warnings.
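
To see which tensors are affected rather than counting warnings by eye, the tensor names can be extracted from the build log. This is a minimal Python sketch, assuming the warning wording shown above; tensors_missing_scale is a hypothetical helper, and the sample log reuses two of the warning lines from this post.

```python
import re

# The pattern assumes the warning text shown above.
_MISSING_SCALE_RE = re.compile(r"Missing scale and zero-point for tensor (\S+),")


def tensors_missing_scale(log_text):
    """Return tensor names from 'Missing scale and zero-point' warnings, in log order."""
    return _MISSING_SCALE_RE.findall(log_text)


log = """\
WARNING: [TRT]: Missing scale and zero-point for tensor output_cov/convolution, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
WARNING: [TRT]: Missing scale and zero-point for tensor output_cov/Sigmoid, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
"""
names = tensors_missing_scale(log)
print(len(names), "tensors without INT8 scales:", names)
```

Running this over the full build log for each calibration file and comparing the resulting lists would show whether the choice of file changes which tensors fall back.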

How did you generate below calibration file?
int8-calib-file: /project/deepstream/prod_model/dn2_rs18/calibration_qat.bin

It was generated by the TAO toolkit (cv_samples_vv1.4.1)
detectnet_v2 jupyter notebook

Step 11. QAT Workflow

11a - retrains pruned model → QAT

tao detectnet_v2 train -e $SPECS_DIR/$RETRAIN_QAT_SPEC \
-r $USER_EXPERIMENT_DIR/experiment_dir_retrain_qat \
-k $KEY \
-n resnet18_detector_pruned_qat \
--gpus $NUM_GPUS

11b - evaluates

11c - exports

tao detectnet_v2 export \
-m $USER_EXPERIMENT_DIR/experiment_dir_retrain_qat/weights/resnet18_detector_pruned_qat.tlt \
-o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector_qat.etlt \
-k $KEY \
-e $SPECS_DIR/$RETRAIN_QAT_SPEC \
--data_type int8 \
--batch_size 64 \
--max_batch_size 64 \
--engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector_qat.trt.int8 \
--cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration_qat.bin \
--gen_ds_config \
--verbose

(this is from the notebook - I made no revisions)

This is the output from the export task (11.c)

2022-09-07 17:33:23,837 [INFO] root: Registry: ['nvcr.io']
2022-09-07 17:33:23,896 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-ru_rmy1s because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
RequestsDependencyWarning)
2022-09-07 17:33:29,668 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/cv_samples_vv1.4.1/detectnet_v2/specs/dn2_rs18_retrain_qat_acer0934.txt
2022-09-07 17:33:31,097 [INFO] iva.common.export.keras_exporter: Using input nodes: ['input_1']
2022-09-07 17:33:31,097 [INFO] iva.common.export.keras_exporter: Using output nodes: ['output_cov/Sigmoid', 'output_bbox/BiasAdd']
/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually.
warnings.warn('No training configuration found in save file: '
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
DEBUG [/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['output_cov/Sigmoid', 'output_bbox/BiasAdd'] as outputs
2022-09-07 17:38:52,921 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Moving to tao forum.

Seems that you already trained a qat model.
Could you share the calibration_qat.bin ?

calibration_qat.bin (1.2 KB)

Thanks for your help. Sorry for the delay. I’ve been out of town. Attached is the calibration_qat.bin generated by the notebook (TAO) process.