Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
RTX 2060
• DeepStream Version
DeepStream 7.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
the version bundled in the DeepStream 7.0 container
• NVIDIA GPU Driver Version (valid for GPU only)
535
• Issue Type( questions, new requirements, bugs)
questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
I want to use pytorch-quantization to build an INT8 classification model for DeepStream 7, a workflow that worked fine under DeepStream 6. The process: quantize the model with pytorch-quantization==2.1.3, export it to a TorchScript (jit) file, then compile it to an INT8 TensorRT engine with torch-tensorrt==1.4.0.
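For reference, the export step looks roughly like this (a sketch of my flow; the model path, input shape, and batch size are placeholders from my setup):

import torch
import torch_tensorrt

# TorchScript model already quantized/calibrated with pytorch-quantization
model = torch.jit.load("convnext_tiny_int8.jit").eval().cuda()  # placeholder path

# Compile forward() to a serialized TensorRT engine and write the plan file
engine_bytes = torch_tensorrt.convert_method_to_trt_engine(
    model,
    method_name="forward",
    inputs=[torch_tensorrt.Input((16, 3, 224, 224))],  # assumed batch and size
    enabled_precisions={torch.int8},
)
with open("convnext_tiny_int8.trt", "wb") as f:
    f.write(engine_bytes)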
This step now fails: the resulting .trt file throws an error when it is loaded by the DeepStream secondary classifier (SGIE):
ERROR: [TRT]: 1: [runtime.cpp::parsePlan::314] Error Code 1: Serialization (Serialization assertion plan->header.magicTag == rt::kPLAN_MAGIC_TAG failed.)
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:1540 Deserialize engine failed from file: /root/ai/weights/convnext_tiny.in12k_ft_in1k_bird2683_16_int8.trt
0:00:09.422453039 587 0x5bf6fbc35240 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 5]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2083> [UID = 5]: deserialize engine from file :/root/ai/weights/convnext_tiny.in12k_ft_in1k_bird2683_16_int8.trt failed
0:00:09.616871719 587 0x5bf6fbc35240 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 5]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2188> [UID = 5]: deserialize backend context from engine from file :/root/ai/weights/convnext_tiny.in12k_ft_in1k_bird2683_16_int8.trt failed, try rebuild
0:00:09.616900305 587 0x5bf6fbc35240 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 5]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2109> [UID = 5]: Trying to create engine from model files
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:870 failed to build network since there is no model file matched.
ERROR: ../nvdsinfer/nvdsinfer_model_builder.cpp:809 failed to build network.
0:00:17.749614745 587 0x5bf6fbc35240 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 5]: Error in NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2129> [UID = 5]: build engine file failed
0:00:17.950502275 587 0x5bf6fbc35240 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 5]: Error in NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2215> [UID = 5]: build backend context failed
0:00:17.950534728 587 0x5bf6fbc35240 ERROR nvinfer gstnvinfer.cpp:676:gst_nvinfer_logger:<secondary_gie_0> NvDsInferContext[UID 5]: Error in NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1352> [UID = 5]: generate backend failed, check config file settings
0:00:17.950601441 587 0x5bf6fbc35240 WARN nvinfer gstnvinfer.cpp:912:gst_nvinfer_start:<secondary_gie_0> error: Failed to create NvDsInferContext instance
0:00:17.950608794 587 0x5bf6fbc35240 WARN nvinfer gstnvinfer.cpp:912:gst_nvinfer_start:<secondary_gie_0> error: Config file path: /root/ai/incarai2024/configs/config_infer_secondary_bird.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <init_deep_stream:1699>: Failed to set pipeline to PAUSED
However, the same .trt file deserializes and runs fine in the following Python code:
# Imports needed for this snippet
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # initializes the CUDA context

# Load and deserialize the TensorRT engine
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
with open(trt_file_path, "rb") as f:
    engine_data = f.read()
runtime = trt.Runtime(TRT_LOGGER)
engine = runtime.deserialize_cuda_engine(engine_data)

# Create execution context
context = engine.create_execution_context()

# Classification labels
labels = classes

# Accuracy / latency counters
total_images = 0
correct_predictions = 0
total_time = 0.0

# Allocate GPU memory: 4 bytes per float32 element
d_input = cuda.mem_alloc(batch_size * 3 * img_size * img_size * 4)  # NCHW input batch
d_output = cuda.mem_alloc(batch_size * num_classes * 4)             # class scores

# Create CUDA stream
stream = cuda.Stream()
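The per-batch inference loop then continues roughly like this (sketch; `batch` is a preprocessed float32 NCHW array from my test set):

import numpy as np

h_input = np.ascontiguousarray(batch, dtype=np.float32)           # host input
h_output = np.empty((batch_size, num_classes), dtype=np.float32)  # host output

cuda.memcpy_htod_async(d_input, h_input, stream)
context.execute_async_v2(bindings=[int(d_input), int(d_output)],
                         stream_handle=stream.handle)
cuda.memcpy_dtoh_async(h_output, d_output, stream)
stream.synchronize()

preds = h_output.argmax(axis=1)  # predicted class per image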
I tried upgrading to pytorch-quantization==2.2.1, but that made things worse: I could not find matching torch and torch-tensorrt versions for it at all. What should I do?
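In case it helps, this is how I print the TensorRT version in each environment to compare the one used for export against the one inside the DeepStream 7.0 container:

import tensorrt as trt
print(trt.__version__)  # run in the export environment and in the container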