NVIDIA Developer Forums

Cuda error for converted model from trtexec


936214531 November 24, 2024, 2:09pm 1

• Hardware Platform (Jetson / GPU): GPU
• DeepStream Version: 7.1
• TensorRT Version: 10.6.0.26
• NVIDIA GPU Driver Version (valid for GPU only): 560.35.03
• Issue Type (questions, new requirements, bugs): bugs
• How to reproduce the issue?
Although my TensorRT version is newer than the one DeepStream requires, the deepstream-app example works very well. However, when I load an engine file in deepstream-app that was converted with trtexec, I get a strange error:
Failed to query video capabilities: Invalid argument
** INFO: <bus_callback:277>: Pipeline running

ERROR: [TRT]: IExecutionContext::enqueueV3: Error Code 1: CuTensor (Internal cuTensor permutate execute failed)
ERROR: [TRT]: [checkMacros.cpp::catchCudaError::212] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: nvdsinfer_backend.cpp:345 Failed to enqueue trt inference batch
ERROR: nvdsinfer_context_impl.cpp:1903 Infer context enqueue buffer failed, nvinfer error:NVDSINFER_TENSORRT_ERROR
0:00:01.650998612 29961 0x7fffd4001d60 WARN nvinfer gstnvinfer.cpp:2115:gst_nvinfer_process_tensor_input:<primary_gie1> error: Failed to queue input batch for inferencing
ERROR from primary_gie1: Failed to queue input batch for inferencing
Debug info: gstnvinfer.cpp(2115): gst_nvinfer_process_tensor_input (): /GstPipeline:pipeline/GstBin:primary_gie_bin1/GstNvInfer:primary_gie1
cuCtxCreate failed with error(700) gst_eglglessink_cuda_init
Cuda failure: status=700
Error(-1) in buffer allocation

** (Hysteroscopy:29961): CRITICAL **: 20:23:46.151: gst_nvds_buffer_pool_alloc_buffer: assertion ‘mem’ failed
ERROR from tiled_display_tiler: GstNvTiler: FATAL; Failed to allocate buffers

Debug info: gstnvtiler.cpp(622): gst_nvmultistreamtiler_decide_allocation (): /GstPipeline:pipeline/GstBin:tiled_display_bin/GstNvMultiStreamTiler:tiled_display_tiler

Another way to convert the ONNX file to a TensorRT engine is to use nvdsinfer_model_builder, and that works very well. But I still want to use trtexec because I can set the precision for each layer. This is my trtexec command:
trtexec --explicitBatch --onnx=sim_cnn.onnx --saveEngine=model.trt --fp16 --precisionConstraints=obey --layerPrecisions=/downsample_layers.0/downsample_layers.0.1/ReduceMean_1:fp32,/downsample_layers.0/downsample_layers.0.1/ReduceMean:fp32,/downsample_layers.0/downsample_layers.0.1/Pow:fp32,/downsample_layers.1/downsample_layers.1.0/ReduceMean:fp32,/downsample_layers.1/downsample_layers.1.0/Pow:fp32,/downsample_layers.1/downsample_layers.1.0/ReduceMean_1:fp32,/downsample_layers.2/downsample_layers.2.0/ReduceMean:fp32,/downsample_layers.2/downsample_layers.2.0/Pow:fp32,/downsample_layers.2/downsample_layers.2.0/ReduceMean_1:fp32,/downsample_layers.3/downsample_layers.3.0/ReduceMean:fp32,/downsample_layers.3/downsample_layers.3.0/Pow:fp32,/downsample_layers.3/downsample_layers.3.0/ReduceMean_1:fp32
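
A long `--layerPrecisions` value like the one above is easy to mistype, so it can help to generate it. The following is only a sketch: the layer names and flags are copied from the command above, while the helper names (`layer_precisions_arg`, `downsample_layers`, `build_trtexec_command`) are made up for illustration. It only prints the command; it does not run trtexec.

```python
import shlex


def layer_precisions_arg(layers, precision="fp32"):
    """Join layer names into the 'name:precision,name:precision' form."""
    return ",".join(f"{name}:{precision}" for name in layers)


def downsample_layers():
    """Rebuild the 12 layer names listed in the command above."""
    names = []
    # (stage index, sub-module index) pairs as they appear in the post.
    for stage, sub in [(0, 1), (1, 0), (2, 0), (3, 0)]:
        prefix = f"/downsample_layers.{stage}/downsample_layers.{stage}.{sub}"
        for op in ("ReduceMean", "Pow", "ReduceMean_1"):
            names.append(f"{prefix}/{op}")
    return names


def build_trtexec_command(onnx_path, engine_path, layers):
    """Assemble the same flags as the posted command, as an argument list."""
    return [
        "trtexec",
        "--explicitBatch",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        "--fp16",
        "--precisionConstraints=obey",
        f"--layerPrecisions={layer_precisions_arg(layers)}",
    ]


cmd = build_trtexec_command("sim_cnn.onnx", "model.trt", downsample_layers())
print(shlex.join(cmd))
```

Keeping the names in a list also makes it easier to drop layers one at a time when narrowing down which constraint triggers the failure.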

junshengy November 25, 2024, 11:48am 3

If you only specify the minShapes/optShapes/maxShapes parameters without specifying mixed precision, does it work properly?

For example (adjust the parameters to match your model):

/usr/src/tensorrt/bin/trtexec --fp16 --minShapes=input:1x3x480x640 --optShapes=input:32x3x480x640 --maxShapes=input:32x3x480x640 --onnx='/model.onnx' --saveEngine='model_b16.plan' --workspace=7000

Try TensorRT 10.3 first. You can use Docker to avoid the installation process.
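
The three shape flags in the command above follow one pattern, `<tensor name>:<batch>x<C>x<H>x<W>`, so they can be generated from a single description of the input. This is a minimal sketch with hypothetical helper names (`shape_flag`, `dynamic_batch_flags`). It assumes the ONNX input tensor is really called `input` and that its batch dimension is dynamic; check both against your own model.

```python
def shape_flag(kind, tensor, dims):
    """Format one trtexec shape flag, e.g. --minShapes=input:1x3x480x640."""
    return f"--{kind}Shapes={tensor}:" + "x".join(str(d) for d in dims)


def dynamic_batch_flags(tensor, chw, min_batch, opt_batch, max_batch):
    """Build the min/opt/max shape flags for a dynamic batch dimension."""
    if not (1 <= min_batch <= opt_batch <= max_batch):
        raise ValueError("expected 1 <= min <= opt <= max batch size")
    return [
        shape_flag("min", tensor, (min_batch, *chw)),
        shape_flag("opt", tensor, (opt_batch, *chw)),
        shape_flag("max", tensor, (max_batch, *chw)),
    ]


# Same values as in the example command: 3x480x640 input, batch 1 to 32.
flags = dynamic_batch_flags("input", (3, 480, 640), 1, 32, 32)
print(" ".join(flags))
```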

junshengy November 28, 2024, 3:53am 4

To configure mixed precision for the model, refer to this link, and make sure that the input layers are not modified.

https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps/blob/master/configs/nvinfer/yolov3_tao/pgie_yolov3_tao_config.txt#L49

          num-detected-classes=4
          interval=0
          gie-unique-id=1
          is-classifier=0
          #network-type=0
          #no cluster
          cluster-mode=3
          output-blob-names=BatchedNMS
          parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
          custom-lib-path=../../../post_processor/libnvds_infercustomparser_tao.so
          layer-device-precision=cls/Sigmoid:fp32:gpu;cls/Sigmoid_1:fp32:gpu;box/Sigmoid_1:fp32:gpu;box/Sigmoid:fp32:gpu;cls/Reshape_reshape:fp32:gpu;box/Reshape_reshape:fp32:gpu;Transpose2:fp32:gpu;sm_reshape:fp32:gpu;encoded_sm:fp32:gpu;conv_big_object:fp32:gpu;cls/mul:fp32:gpu;box/concat_concat:fp32:gpu;box/add_1:fp32:gpu;box/mul_4:fp32:gpu;box/add:fp32:gpu;box/mul_6:fp32:gpu;box/sub_1:fp32:gpu;box/add_2:fp32:gpu;box/add_3:fp32:gpu;yolo_conv1_6:fp32:gpu;yolo_conv1_6_lrelu:fp32:gpu;yolo_conv2:fp32:gpu;Resize1:fp32:gpu;yolo_conv1_5_lrelu:fp32:gpu;encoded_bg:fp32:gpu;yolo_conv4_lrelu:fp32:gpu;yolo_conv4:fp32:gpu;
          
          [class-attrs-all]
          pre-cluster-threshold=0.3
          roi-top-offset=0
          roi-bottom-offset=0
          detected-min-w=0
          detected-min-h=0
          detected-max-w=0
          detected-max-h=0
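
The `layer-device-precision` line in the config above uses `layer:precision:device;` entries, while trtexec's `--layerPrecisions` uses comma-separated `layer:precision` pairs. A small sketch of converting one into the other follows. The function name `to_layer_device_precision` is hypothetical. Whether nvinfer matches the ONNX node names from the original command exactly as written is an assumption to verify, and the setting presumably only matters when nvinfer builds the engine itself rather than loading a prebuilt one.

```python
def to_layer_device_precision(layer_precisions, device="gpu"):
    """Convert 'name:prec,name:prec' into a layer-device-precision config line."""
    entries = []
    for item in layer_precisions.split(","):
        item = item.strip()
        if not item:
            continue
        # Split on the last colon so the layer name itself may contain colons.
        name, sep, precision = item.rpartition(":")
        if not sep or not name or not precision:
            raise ValueError(f"expected 'layer:precision', got {item!r}")
        entries.append(f"{name}:{precision}:{device};")
    return "layer-device-precision=" + "".join(entries)


# Two of the layers from the trtexec command earlier in the thread.
trtexec_value = (
    "/downsample_layers.1/downsample_layers.1.0/ReduceMean:fp32,"
    "/downsample_layers.1/downsample_layers.1.0/Pow:fp32"
)
print(to_layer_device_precision(trtexec_value))
```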

yingliu December 30, 2024, 10:33am 5

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

system Closed January 13, 2025, 10:34am 6

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
