CUDA error with an engine converted by trtexec

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 7.1
• TensorRT Version: 10.6.0.26
• NVIDIA GPU Driver Version (valid for GPU only): 560.35.03
• Issue Type (questions, new requirements, bugs): bugs
• How to reproduce the issue?
Although the installed TensorRT version is newer than the one DeepStream requires, the deepstream-app example works well. However, when I load an engine file in deepstream-app that was converted with trtexec, I get a strange error:
Failed to query video capabilities: Invalid argument
** INFO: <bus_callback:277>: Pipeline running

ERROR: [TRT]: IExecutionContext::enqueueV3: Error Code 1: CuTensor (Internal cuTensor permutate execute failed)
ERROR: [TRT]: [checkMacros.cpp::catchCudaError::212] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: nvdsinfer_backend.cpp:345 Failed to enqueue trt inference batch
ERROR: nvdsinfer_context_impl.cpp:1903 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_TENSORRT_ERROR
0:00:01.650998612 29961 0x7fffd4001d60 WARN nvinfer gstnvinfer.cpp:2115:gst_nvinfer_process_tensor_input:<primary_gie1> error: Failed to queue input batch for inferencing
ERROR from primary_gie1: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(2115): gst_nvinfer_process_tensor_input (): /GstPipeline:pipeline/GstBin:primary_gie_bin1/GstNvInfer:primary_gie1
cuCtxCreate failed with error(700) gst_eglglessink_cuda_init
Cuda failure: status=700
Error(-1) in buffer allocation

** (Hysteroscopy:29961): CRITICAL **: 20:23:46.151: gst_nvds_buffer_pool_alloc_buffer: assertion ‘mem’ failed
ERROR from tiled_display_tiler: GstNvTiler: FATAL; Failed to allocate buffers

Debug info: gstnvtiler.cpp(622): gst_nvmultistreamtiler_decide_allocation (): /GstPipeline:pipeline/GstBin:tiled_display_bin/GstNvMultiStreamTiler:tiled_display_tiler

Another way to convert the ONNX file to a TensorRT engine is to use nvdsinfer_model_builder, and that works very well. But I still want to use trtexec because I can set the precision for each layer. This is my trtexec command:
trtexec --explicitBatch --onnx=sim_cnn.onnx --saveEngine=model.trt --fp16 --precisionConstraints=obey --layerPrecisions=/downsample_layers.0/downsample_layers.0.1/ReduceMean_1:fp32,/downsample_layers.0/downsample_layers.0.1/ReduceMean:fp32,/downsample_layers.0/downsample_layers.0.1/Pow:fp32,/downsample_layers.1/downsample_layers.1.0/ReduceMean:fp32,/downsample_layers.1/downsample_layers.1.0/Pow:fp32,/downsample_layers.1/downsample_layers.1.0/ReduceMean_1:fp32,/downsample_layers.2/downsample_layers.2.0/ReduceMean:fp32,/downsample_layers.2/downsample_layers.2.0/Pow:fp32,/downsample_layers.2/downsample_layers.2.0/ReduceMean_1:fp32,/downsample_layers.3/downsample_layers.3.0/ReduceMean:fp32,/downsample_layers.3/downsample_layers.3.0/Pow:fp32,/downsample_layers.3/downsample_layers.3.0/ReduceMean_1:fp32
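A long hand-typed `--layerPrecisions` value like the one above is easy to get wrong (a missing comma or a misspelled layer name). As a rough sketch only, the command could be assembled from a list of layer names instead. The helper names `layer_precisions_arg` and `trtexec_command` are hypothetical, the set of accepted precision strings is an assumption, and the sketch only builds the command string; it does not run trtexec or check that the layer names exist in the model.

```python
import shlex

# Assumed set of precision names; check `trtexec --help` for your version.
ALLOWED_PRECISIONS = {"fp32", "fp16", "int8"}


def layer_precisions_arg(precisions):
    """Join a {layer_name: precision} mapping into 'name:prec,name:prec,...'."""
    parts = []
    for name, precision in precisions.items():
        if precision not in ALLOWED_PRECISIONS:
            raise ValueError(f"unsupported precision {precision!r} for {name!r}")
        if "," in name:
            # A comma inside a name would be read as an entry separator.
            raise ValueError(f"layer name contains a comma: {name!r}")
        parts.append(f"{name}:{precision}")
    return ",".join(parts)


def trtexec_command(onnx_path, engine_path, precisions):
    """Build a shell-quoted trtexec command with per-layer precision constraints."""
    args = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--fp16",
        "--precisionConstraints=obey",
        f"--layerPrecisions={layer_precisions_arg(precisions)}",
    ]
    return shlex.join(args)


if __name__ == "__main__":
    # Two of the layer names from the command above, kept in fp32.
    fp32_layers = [
        "/downsample_layers.0/downsample_layers.0.1/ReduceMean",
        "/downsample_layers.0/downsample_layers.0.1/Pow",
    ]
    print(trtexec_command("sim_cnn.onnx", "model.trt", {n: "fp32" for n in fp32_layers}))
```

Keeping the layer names in a list also makes it easier to drop layers one at a time when narrowing down which constraint triggers the failure.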

If you only specify the minShapes/optShapes/maxShapes parameters without specifying mixed precision, does it work properly?

For example (adjust the parameters to match your model):

/usr/src/tensorrt/bin/trtexec --fp16 --minShapes=input:1x3x480x640 --optShapes=input:32x3x480x640 --maxShapes=input:32x3x480x640 --onnx='/model.onnx' --saveEngine='model_b16.plan' --workspace=7000
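The three shape flags in the command above have to describe a consistent range: for every dimension, min must not exceed opt, and opt must not exceed max. A minimal sketch of generating and sanity-checking them follows. The function name `dynamic_shape_flags` is hypothetical, the tensor name `input` is taken from the example command and must match your model's actual input name, and requiring every dimension to be at least 1 is a simplification.

```python
def dynamic_shape_flags(tensor, min_dims, opt_dims, max_dims):
    """Return the --minShapes/--optShapes/--maxShapes flags for one input tensor."""
    if not (len(min_dims) == len(opt_dims) == len(max_dims)):
        raise ValueError("min, opt and max shapes must have the same rank")
    for axis, (lo, mid, hi) in enumerate(zip(min_dims, opt_dims, max_dims)):
        # Each dimension must satisfy min <= opt <= max.
        if not (1 <= lo <= mid <= hi):
            raise ValueError(f"axis {axis}: need 1 <= min <= opt <= max, got {lo}, {mid}, {hi}")

    def fmt(flag, dims):
        # trtexec expects e.g. --minShapes=input:1x3x480x640
        return f"--{flag}={tensor}:" + "x".join(str(d) for d in dims)

    return [
        fmt("minShapes", min_dims),
        fmt("optShapes", opt_dims),
        fmt("maxShapes", max_dims),
    ]


if __name__ == "__main__":
    # Same shapes as in the command above.
    flags = dynamic_shape_flags("input", (1, 3, 480, 640), (32, 3, 480, 640), (32, 3, 480, 640))
    print(" ".join(flags))
```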

Try TensorRT 10.3 first; you can use Docker to avoid having to install it manually.

To configure mixed precision for the model, you can refer to this link. Make sure that the input layers are not modified.