I have trained a yolo_v4 model on TAO Toolkit 4.0.1 with quantization (QAT) set to true, exported the model to INT8, and tried deploying it in my Python application, which runs on DeepStream 6.0. I am getting the following error on the console:
parseModel: Failed to parse ONNX model
ERROR: tlt/tlt_decode.cpp:389 Failed to build network, error in model parsing.
The following is the command I used for exporting the model:
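Roughly, it had the shape of a standard TAO 4.0 yolo_v4 export. The paths, spec file, and key below are placeholders rather than my actual values, and any INT8/QAT-specific flags should be checked against the TAO docs rather than taken from this sketch:

# placeholder paths and key; -m is the trained .tlt, -e the training spec, -o the exported .etlt
tao yolo_v4 export \
  -m /workspace/experiments/yolov4/weights/yolov4_resnet18_epoch_080.tlt \
  -k $KEY \
  -e /workspace/specs/yolo_v4_train_resnet18.txt \
  -o /workspace/export/d17_jul24_yolov4_resnet18_epoch_080_v4.1.etlt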
The infer-dims value I had given in the nvinfer config was wrong; I have updated it. This is the new nvinfer config: d15_yolov3_resnet18_epoch_070.conf (4.4 KB). The error
still persists; this is the debug log for the updated config: 20230804.log (122.2 KB)
@adithya.ajith
Could you double-check the .etlt model?
You exported it, and its name is d17_jul24_yolov4_resnet18_epoch_080_v4.1.etlt.
Could you set that in the DeepStream config as well?
Currently, it is
That is the file name of the model; in the logs
you can see DeepStream loading the model layers. I had exported the model twice, thinking there might have been an issue with the export command not running properly. The export command I shared is from the first run, and the model name was different the second time I ran the export. Sorry for the confusion caused.
Please check the input width and height again.
In your training spec, it is
output_width: 960
output_height: 544
But in the DS config, it does not match:
infer-dims=3;576;1024
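A matching setting (nvinfer's infer-dims is ordered channel;height;width) would be along these lines:

# must match output_height: 544 and output_width: 960 from the training spec
infer-dims=3;544;960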
More, to check whether the .etlt model and its key are correct, I suggest you run the experiment below to narrow this down.
After exporting, please check whether tao-deploy yolo_v4 gen_trt_engine and tao-deploy yolo_v4 inference work. Refer to GPU-optimized AI, Machine Learning, & HPC Software | NVIDIA NGC.
--cal_json_file: The path to the json file containing tensor scale for QAT models. This argument is required if engine for QAT model is being generated.
Note
When exporting a model trained with QAT enabled, the tensor scale factors to calibrate the activations are peeled out of the model and serialized to a JSON file defined by the cal_json_file argument.
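For reference, the two checks would look roughly like the sketch below. The paths, spec file, and key are placeholders, and the exact flag names should be verified against the tao-deploy yolo_v4 documentation on NGC:

# build an INT8 engine from the QAT .etlt using the serialized tensor scales
tao-deploy yolo_v4 gen_trt_engine \
  -m /workspace/export/d17_jul24_yolov4_resnet18_epoch_080_v4.1.etlt \
  -k $KEY \
  -e /workspace/specs/yolo_v4_train_resnet18.txt \
  --data_type int8 \
  --cal_json_file /workspace/export/cal.json \
  --engine_file /workspace/export/yolov4_int8.engine

# run inference with the generated engine on a few test images
tao-deploy yolo_v4 inference \
  -m /workspace/export/yolov4_int8.engine \
  -e /workspace/specs/yolo_v4_train_resnet18.txt \
  -i /workspace/test_images \
  -r /workspace/inference_results

If both steps work, the .etlt model and key can be ruled out and the problem is on the DeepStream side.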
Now the command runs without any error and the INT8 TRT engine is created. I ran tao yolo_v4 inference with the engine file; it runs, and I am getting inference results on the frames. So I guess the issue is not with the .etlt file, and the key is correct.