Unable to generate a TensorRT engine with the ds-tao-detection app for a QAT-trained yolov4_tiny .etlt model

Deployment hardware specification

  • Hardware Platform : Jetson Xavier NX
  • DeepStream Version : 6.2
  • JetPack Version : 5.1

• Issue: As the subject suggests, I am unable to create an INT8 engine for my Jetson Xavier device. I followed the procedure below to create my model.

  1. On the Jetson:
  • Built TensorRT OSS on the Jetson.
  • Built the ds-tao-detection app.
  2. Model details and creation:
  • QAT-enabled trained .etlt model.
  • Labels file.
  3. Hardware used for model creation:
  • GeForce RTX 4070 Ti
  • Network type: yolov4_tiny
  • TLT version: format_version: 2.0, toolkit_version: 4.0.1

• My config file to run the model is as follows:

[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=/home/nvidia/Downloads/export_qat/labels.txt
model-engine-file=/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine
tlt-encoded-model=/home/nvidia/Downloads/deepstream_tlt_apps/post_processor/yolov4_cspdarknet_tiny_epoch_080.etlt
tlt-model-key=nvidia_tlt
infer-dims=3;640;640
maintain-aspect-ratio=0
output-tensor-meta=0
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=5
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/home/nvidia/Downloads/deepstream_tlt_apps/post_processor/libnvds_infercustomparser_tao.so

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

• How to reproduce the issue:
I run the following command.

./ds-tao-detection -c /home/nvidia/Downloads/pgie_yolov4_tiny_tao_config.txt -i 1.mp4
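Since nvinfer only warns and then tries to rebuild when a referenced file cannot be opened, it can help to sanity-check every path in the config before launching. A minimal sketch (the script name and key list are my own, not part of the DeepStream apps):

```python
# check_config_paths.py -- hypothetical helper: sanity-check file paths
# referenced in a DeepStream nvinfer config before running the app.
import os
import sys

# Config keys whose values are expected to be filesystem paths.
PATH_KEYS = ("labelfile-path", "model-engine-file", "tlt-encoded-model",
             "int8-calib-file", "custom-lib-path")

def check_paths(config_text):
    """Return (key, path) pairs whose path does not exist on disk."""
    missing = []
    for line in config_text.splitlines():
        line = line.strip()
        if "=" not in line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        if key.strip() in PATH_KEYS and not os.path.isfile(value.strip()):
            missing.append((key.strip(), value.strip()))
    return missing

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        for key, path in check_paths(f.read()):
            print(f"MISSING: {key} -> {path}")
```

Note that a missing model-engine-file is expected on the first run (nvinfer builds it), but a missing tlt-encoded-model or custom-lib-path will make the engine build fail.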

• I get the following error; any help would be really appreciated.

nvidia@nvidia-desktop:~/Downloads/deepstream_tlt_apps/apps/tao_detection$ ./ds-tao-detection -c /home/nvidia/Downloads/pgie_yolov4_tiny_tao_config.txt -i 1.mp4 
Request sink_0 pad from streammux
batchSize 1...
Now playing: /home/nvidia/Downloads/pgie_yolov4_tiny_tao_config.txt
Opening in BLOCKING MODE 
WARNING: Deserialize engine failed because file path: /home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine open error
0:00:05.437904948 732312 0xaaaac96bbd30 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine failed
0:00:05.523338857 732312 0xaaaac96bbd30 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine failed, try rebuild
0:00:05.523459402 732312 0xaaaac96bbd30 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
NvDsInferCudaEngineGetFromTltModel: Failed to open TLT encoded model file /home/nvidia/Downloads/deepstream_tlt_apps/post_processor/yolov4_cspdarknet_tiny_epoch_080.etlt
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
ERROR: Failed to create network using custom network creation function
ERROR: Failed to get cuda engine from custom library API
0:00:07.280273253 732312 0xaaaac96bbd30 ERROR                nvinfer gstnvinfer.cpp:674:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1943> [UID = 1]: build engine file failed
ERROR: [TRT]: 2: [logging.cpp::decRefCount::65] Error Code 2: Internal Error (Assertion mRefCount > 0 failed. )
corrupted size vs. prev_size while consolidating
Aborted (core dumped)

Also, if there is a better way to do this, it would be really helpful.

Does this file “/home/nvidia/Downloads/deepstream_tlt_apps/post_processor/yolov4_cspdarknet_tiny_epoch_080.etlt” exist?

Yes, it does.
There is also a cal.json file that I generated but have not used in the config file. Also take into consideration that, according to the docs, since the model is trained with QAT enabled it is only possible to generate an INT8 model; please correct me if I am wrong or missing anything.

{
    "tensor_scales": {
        "conv_0_mish/Relu6:0": 5.999761581420898,
        "conv_1_mish/Relu6:0": 5.999761581420898,
        "conv_2_conv_0_mish/Relu6:0": 5.999761581420898,
        "conv_2_split_0/strided_slice:0": 5.999783515930176,
        "conv_2_conv_1_mish/Relu6:0": 5.999761581420898,
        "conv_2_conv_2_mish/Relu6:0": 5.999761581420898,
        "conv_2_concat_0/concat:0": 5.999783515930176,
        "conv_2_conv_3_mish/Relu6:0": 5.999761581420898,
        "conv_2_concat_1/concat:0": 5.999783515930176,
        "conv_2_pool_0/MaxPool:0": 5.999799728393555,
        "conv_3_conv_0_mish/Relu6:0": 5.999761581420898,
        "conv_3_split_0/strided_slice:0": 5.999783515930176,
        "conv_3_conv_1_mish/Relu6:0": 5.999761581420898,
        "conv_3_conv_2_mish/Relu6:0": 5.999761581420898,
        "conv_3_concat_0/concat:0": 5.999783515930176,
        "conv_3_conv_3_mish/Relu6:0": 5.999761581420898,
        "conv_3_concat_1/concat:0": 5.999783515930176,
        "conv_3_pool_0/MaxPool:0": 5.999799728393555,
        "conv_4_conv_0_mish/Relu6:0": 5.999761581420898,
        "conv_4_split_0/strided_slice:0": 5.999783515930176,
        "conv_4_conv_1_mish/Relu6:0": 5.999761581420898,
        "conv_4_conv_2_mish/Relu6:0": 5.999761581420898,
        "conv_4_concat_0/concat:0": 5.999783515930176,
        "conv_4_conv_3_mish/Relu6:0": 5.999761581420898,
        "conv_4_concat_1/concat:0": 5.999783515930176,
        "conv_4_pool_0/MaxPool:0": 5.999799728393555,
        "conv_5_mish/Relu6:0": 5.999761581420898,
        "yolo_conv1_1_lrelu/Relu6:0": 5.999761581420898,
        "yolo_conv2_lrelu/Relu6:0": 5.995466232299805,
        "upsample0/transpose_1:0": 5.989624500274658,
        "concatenate_2/concat:0": 5.999783515930176,
        "yolo_conv1_6_lrelu/Relu6:0": 5.999761581420898,
        "yolo_conv3_6_lrelu/Relu6:0": 5.999761581420898,
        "Input": 151.05337524414062
    }
}
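For reference, the cal.json above holds per-tensor activation scales recorded during QAT training. A short sketch of how it can be inspected (assuming the {"tensor_scales": {name: scale}} layout shown above; the inline example values are abbreviated):

```python
# Inspect the QAT tensor-scale file exported alongside the .etlt model.
# Assumes the {"tensor_scales": {tensor_name: scale}} layout shown above.
import json

def load_tensor_scales(text):
    """Parse the cal.json payload and return the tensor_scales dict."""
    return json.loads(text)["tensor_scales"]

example = '{"tensor_scales": {"Input": 151.05, "conv_0_mish/Relu6:0": 5.9998}}'
scales = load_tensor_scales(example)
# In the dump above, the raw Input tensor has by far the widest range.
widest = max(scales, key=scales.get)
print(widest, scales[widest])
```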
  1. Could you share the output of “ll /home/nvidia/Downloads/deepstream_tlt_apps/post_processor/yolov4_cspdarknet_tiny_epoch_080.etlt”?
  2. Can you try to convert the model with tao-converter? For example: ./tao-converter -k nvidia_tlt -t fp32 -b 1 -p Input,1x3x640x640,1x3x640x640,1x3x640x640 -e yolov4_cspdarknet_tiny_epoch_080.etlt.engine yolov4_cspdarknet_tiny_epoch_080.etlt
  • Very noob mistake on my side, extremely sorry: it was the wrong path. Here is my latest config file:
[property]
gpu-id=0
net-scale-factor=1.0
offsets=103.939;116.779;123.68
model-color-format=1
labelfile-path=/home/nvidia/Downloads/export_qat/labels.txt
model-engine-file=/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine
tlt-encoded-model=/home/nvidia/Downloads/export_qat/yolov4_cspdarknet_tiny_epoch_080.etlt
tlt-model-key=nvidia_tlt
infer-dims=3;640;640
maintain-aspect-ratio=0
output-tensor-meta=0
uff-input-order=0
uff-input-blob-name=Input
batch-size=1
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=5
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/home/nvidia/Downloads/deepstream_tlt_apps/post_processor/libnvds_infercustomparser_tao.so

[class-attrs-all]
pre-cluster-threshold=0.3
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0

I am still getting an error; the output is below.

Request sink_0 pad from streammux
batchSize 1...
Now playing: /home/nvidia/Downloads/pgie_yolov4_tiny_tao_config.txt
Opening in BLOCKING MODE 
WARNING: Deserialize engine failed because file path: /home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine open error
0:00:17.209800487  5054 0xaaab16a19530 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine failed
0:00:17.321318434  5054 0xaaab16a19530 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine failed, try rebuild
0:00:17.321427459  5054 0xaaab16a19530 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
WARNING: [TRT]: onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
(the warning above is repeated 21 times in the original log)
WARNING: [TRT]: builtin_op_importers.cpp:5243: Attribute caffeSemantics not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
ERROR: [TRT]: 4: [network.cpp::validate::3096] Error Code 4: Internal Error (Input: for dimension number 2 in profile 0 does not match network definition (got min=640, opt=640, max=640), expected min=opt=max=384).)
ERROR: Build engine failed from config file
Segmentation fault (core dumped)

It says expected min=opt=max=384, but my whole model was trained using 640x640 images (min=640, opt=640, max=640), so I am clueless why I am getting this error.
Also, what is the use of the .json file that I mentioned above?

  1. When network-mode is 1, you need to set int8-calib-file.
  2. Can you try infer-dims=3;384;384? Can the engine be generated successfully?
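Applied to the config above, the INT8 settings would look like this sketch (the calibration-cache path is a placeholder; the cache to point at is the binary calibration file produced by the TAO export step, e.g. via --cal_cache_file, not the cal.json itself):

```
[property]
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
# Placeholder path -- use the calibration cache exported by TAO.
int8-calib-file=/home/nvidia/Downloads/export_qat/cal.bin
```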

I tried it; here is the output:

Request sink_0 pad from streammux
batchSize 1...
Now playing: /home/nvidia/Downloads/pgie_yolov4_tiny_tao_config.txt
Opening in BLOCKING MODE 
WARNING: Deserialize engine failed because file path: /home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine open error
0:00:04.157694938  5928 0xaaaaf9aa7530 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1897> [UID = 1]: deserialize engine from file :/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine failed
0:00:04.253803837  5928 0xaaaaf9aa7530 WARN                 nvinfer gstnvinfer.cpp:677:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: deserialize backend context from engine from file :/home/nvidia/Downloads/export_qat/yolov4_tiny_int8.engine failed, try rebuild
0:00:04.253914142  5928 0xaaaaf9aa7530 INFO                 nvinfer gstnvinfer.cpp:680:gst_nvinfer_logger:<primary-nvinference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:1923> [UID = 1]: Trying to create engine from model files
ERROR: [TRT]: 3: [builder.cpp::~Builder::307] Error Code 3: API Usage Error (Parameter check failed at: optimizer/api/builder.cpp::~Builder::307, condition: mObjectCounter.use_count() == 1. Destroying a builder object before destroying objects it created leads to undefined behavior.
)
WARNING: [TRT]: onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
WARNING: [TRT]: onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
(the warning above is repeated 21 times in the original log)
WARNING: [TRT]: builtin_op_importers.cpp:5243: Attribute caffeSemantics not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
ERROR: [TRT]: 4: [network.cpp::validate::3096] Error Code 4: Internal Error (Input: for dimension number 3 in profile 0 does not match network definition (got min=384, opt=384, max=384), expected min=opt=max=1248).)
ERROR: Build engine failed from config file
Segmentation fault (core dumped)

Can I try network-mode=0 and then use the cal.json file? Will it work?

Thanks for the update. DeepStream uses TensorRT to convert the model, so I am wondering if this is related to TensorRT. Can you try to convert the model with tao-converter? For example: ./tao-converter -k nvidia_tlt -t fp32 -b 1 -p Input,1x3x640x640,1x3x640x640,1x3x640x640 -e yolov4_cspdarknet_tiny_epoch_080.etlt.engine yolov4_cspdarknet_tiny_epoch_080.etlt

My TensorRT is 8.5.2 and I could not find a suitable tao-converter version, so I installed TAO using NGC with the following commands:

sudo apt-get install libssl-dev
export TRT_LIB_PATH="/usr/lib/aarch64-linux-gnu"
export TRT_INC_PATH="/usr/include/aarch64-linux-gnu"
sudo pip3 install nvidia-tao (at first it was not taking the path, hence sudo; no version specified, so it installed the latest)

Now when I run “tao -h” I get:

/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (2.0.2) or chardet (3.0.4) doesn't match a supported version!
  warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
usage: tao [-h]
           {list,stop,info,action_recognition,augment,bpnet,classification_tf1,classification_tf2,converter,deformable_detr,detectnet_v2,dssd,efficientdet_tf1,efficientdet_tf2,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,multitask_classification,n_gram,pointpillars,pose_classification,punctuation_and_capitalization,question_answering,re_identification,retinanet,segformer,spectro_gen,speech_to_text,speech_to_text_citrinet,speech_to_text_conformer,ssd,text_classification,token_classification,unet,vocoder,yolo_v3,yolo_v4,yolo_v4_tiny}
           ...

Launcher for TAO Toolkit.

optional arguments:
  -h, --help            show this help message and exit

tasks:
  {list,stop,info,action_recognition,augment,bpnet,classification_tf1,classification_tf2,converter,deformable_detr,detectnet_v2,dssd,efficientdet_tf1,efficientdet_tf2,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,multitask_classification,n_gram,pointpillars,pose_classification,punctuation_and_capitalization,question_answering,re_identification,retinanet,segformer,spectro_gen,speech_to_text,speech_to_text_citrinet,speech_to_text_conformer,ssd,text_classification,token_classification,unet,vocoder,yolo_v3,yolo_v4,yolo_v4_tiny}

and I cannot run tao-converter; it says command not found.

You can download it from TAO Converter | NVIDIA NGC.

When I run:

./tao-converter -k nvidia_tlt -t fp32 -b 1 -p Input,1x3x640x640,1x3x640x640,1x3x640x640 -e yolov4_cspdarknet_tiny_epoch_080.etlt.engine yolov4_cspdarknet_tiny_epoch_080.etlt


[INFO] [MemUsageChange] Init CUDA: CPU +188, GPU +0, now: CPU 216, GPU 4596 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +106, GPU +102, now: CPU 344, GPU 4718 (MiB)
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/fileaU16gP
[INFO] ONNX IR version:  0.0.7
[INFO] Opset version:    12
[INFO] Producer name:    
[INFO] Producer version: 
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[INFO] ----------------------------------------------------------------
[WARNING] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
(the warning above is repeated 21 times in the original log)
[INFO] No importer registered for op: BatchedNMSDynamic_TRT. Attempting to import as plugin.
[INFO] Searching for plugin: BatchedNMSDynamic_TRT, plugin_version: 1, plugin_namespace: 
[WARNING] builtin_op_importers.cpp:5243: Attribute caffeSemantics not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[INFO] Successfully created plugin: BatchedNMSDynamic_TRT
[INFO] Detected input dimensions from the model: (-1, 3, 384, 1248)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 640, 640) for input: Input
[INFO] Using optimization profile opt shape: (1, 3, 640, 640) for input: Input
[INFO] Using optimization profile max shape: (1, 3, 640, 640) for input: Input
[ERROR] 4: [network.cpp::validate::3096] Error Code 4: Internal Error (Input: for dimension number 2 in profile 0 does not match network definition (got min=640, opt=640, max=640), expected min=opt=max=384).)
[ERROR] Unable to create engine
Segmentation fault (core dumped)

Also, my TensorRT version is 8.5.2, for which I did not find any arm64 tao-converter; instead I used version
v3.21.11_trt8.0_aarch64.
Am I getting the error because of this?

From the log, the dimension is 384x1248. Can you try ./tao-converter -k nvidia_tlt -t fp32 -b 1 -p Input,1x3x384x1248,1x3x384x1248,1x3x384x1248 -e yolov4_cspdarknet_tiny_epoch_080.etlt.engine yolov4_cspdarknet_tiny_epoch_080.etlt?

OK, I will try it. But just now I tried with version v3.22.05_trt8.4_aarch64 and got the following output:

./tao-converter8.4 -k nvidia_tlt -t fp32 -b 1 -p Input,1x3x640x640,1x3x640x640,1x3x640x640 -e yolov4_cspdarknet_tiny_epoch_080.etlt.engine yolov4_cspdarknet_tiny_epoch_080.etlt -u 0


[INFO] [MemUsageChange] Init CUDA: CPU +188, GPU +0, now: CPU 216, GPU 4493 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +106, GPU +133, now: CPU 344, GPU 4648 (MiB)
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/filexbMavf
[INFO] ONNX IR version:  0.0.7
[INFO] Opset version:    12
[INFO] Producer name:    
[INFO] Producer version: 
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[INFO] ----------------------------------------------------------------
[WARNING] onnx2trt_utils.cpp:375: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] onnx2trt_utils.cpp:403: One or more weights outside the range of INT32 was clamped
(the warning above is repeated 21 times in the original log)
[INFO] No importer registered for op: BatchedNMSDynamic_TRT. Attempting to import as plugin.
[INFO] Searching for plugin: BatchedNMSDynamic_TRT, plugin_version: 1, plugin_namespace: 
[WARNING] builtin_op_importers.cpp:5243: Attribute caffeSemantics not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
[INFO] Successfully created plugin: BatchedNMSDynamic_TRT
[INFO] Detected input dimensions from the model: (-1, 3, 384, 1248)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 640, 640) for input: Input
[INFO] Using optimization profile opt shape: (1, 3, 640, 640) for input: Input
[INFO] Using optimization profile max shape: (1, 3, 640, 640) for input: Input
[ERROR] DLA execution was requested for the network using setDefaultDeviceType, but neither FP16 or Int8 mode is enabled
[ERROR] 4: [network.cpp::validate::2789] Error Code 4: Internal Error (DLA validation failed)
[ERROR] Unable to create engine
Segmentation fault (core dumped)

As you suggest, I will try it, but I think it is more of a version error. Is there a way I can get a tao-converter for TensorRT 8.5.2.2? I think that may work.

OK, it is running, but what is confusing is how it can be 384x1248 when my model was trained on 640x640. I will update ASAP when I get the output.

OK, the engine was successfully created; thank you very much. I will check the results.

But I still have some questions, mainly:

1. How can the dimension be 384x1248 when my whole model was trained on 640x640, and I even tested inference on my GPU at 640x640?
2. tao-converter has a -u option for DLA; can I use it since mine is a Jetson Xavier NX device, and if so, what is the core index?
3. Also, I suppose a cal.bin file should get created, right? But I did not get one.

Once again, thanks @fanzh @yingliu for the great help and support.

  1. Please check the training code.
  2. See enable-dla and use-dla-core in nvinfer.
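For question 2, the corresponding nvinfer keys would look like this sketch (Xavier NX exposes two DLA cores, indexed 0 and 1; note that DLA requires FP16 or INT8 mode, which matches the "neither FP16 or Int8 mode is enabled" error seen earlier when -u 0 was passed with -t fp32):

```
[property]
# Offload inference to a DLA core (Jetson Xavier NX has cores 0 and 1).
enable-dla=1
use-dla-core=0
## 0=FP32, 1=INT8, 2=FP16 mode -- DLA does not support FP32
network-mode=2
```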

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.