Run custom engine inside the plugin

• Hardware Platform (Jetson / GPU): T4
• DeepStream Version: 5.1
• TensorRT Version: 7.2
• NVIDIA GPU Driver Version (valid for GPU only): 460

Hi, I’m working on a solution in DeepStream which detects people in video and anonymizes them. As the detector I use YOLOv4, and then, based on the bboxes from YOLO, I create a mask with white ellipses where the bboxes are.

Next, I modify the current frame by blurring the regions that are white in the mask.
I create the mask and modify the frame in a custom library for the videotemplate plugin. As long as I use OpenCV code for the blurring, everything works fine, but I want to make it faster.
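For context, this is a minimal sketch of the kind of OpenCV mask-and-blur step I mean; the BBox struct is just a stand-in for the detector metadata, not the actual code in my library:

#include <opencv2/imgproc.hpp>
#include <opencv2/core.hpp>
#include <vector>

// Hypothetical bbox struct standing in for the detector output.
struct BBox { int left, top, width, height; };

// Build a single-channel mask with white ellipses at the bbox locations,
// then blur only the masked regions of the frame.
void blurDetections(cv::Mat &frame, const std::vector<BBox> &boxes)
{
    cv::Mat mask = cv::Mat::zeros(frame.size(), CV_8UC1);
    for (const auto &b : boxes) {
        cv::Point center(b.left + b.width / 2, b.top + b.height / 2);
        cv::Size axes(b.width / 2, b.height / 2);
        cv::ellipse(mask, center, axes, 0.0, 0.0, 360.0,
                    cv::Scalar(255), cv::FILLED);
    }
    // Blur the whole frame once, then copy the blurred pixels back
    // only where the mask is white.
    cv::Mat blurred;
    cv::GaussianBlur(frame, blurred, cv::Size(31, 31), 0);
    blurred.copyTo(frame, mask);
}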

To achieve this, I created a TensorRT engine that blurs the frame. It takes the frame and the mask as inputs and returns the modified frame. I modified the custom library to work with the engine, but I get an error on the first inference. To run the engine I just added standard inference code to the library. Is it possible to make it work this way, or should I connect it to the DeepStream inference plugin instead?
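By "standard inference code" I mean something along these lines; it is only a sketch, and the binding names image, mask and output are assumptions taken from the ONNX export, not guaranteed to match my actual library:

#include <NvInfer.h>
#include <cuda_runtime_api.h>

// Runs one inference of the two-input blur engine. Assumes the engine and
// execution context were deserialized once at library init, and that
// d_image, d_mask and d_output are device buffers already sized for the
// current frame (h x w), filled with float32 data.
bool runBlurEngine(nvinfer1::ICudaEngine *engine,
                   nvinfer1::IExecutionContext *context,
                   float *d_image, float *d_mask, float *d_output,
                   int h, int w, cudaStream_t stream)
{
    const int imgIdx  = engine->getBindingIndex("image");
    const int maskIdx = engine->getBindingIndex("mask");
    const int outIdx  = engine->getBindingIndex("output");

    // The engine was built with dynamic shapes, so the actual dimensions
    // must be set on the context before every enqueue.
    context->setBindingDimensions(imgIdx,  nvinfer1::Dims3{h, w, 3});
    context->setBindingDimensions(maskIdx, nvinfer1::Dims2{h, w});
    if (!context->allInputDimensionsSpecified())
        return false;

    void *bindings[3] = {nullptr, nullptr, nullptr};
    bindings[imgIdx]  = d_image;   // float32, h x w x 3
    bindings[maskIdx] = d_mask;    // float32, h x w
    bindings[outIdx]  = d_output;  // float32, h x w x 3

    if (!context->enqueueV2(bindings, stream, nullptr))
        return false;
    return cudaStreamSynchronize(stream) == cudaSuccess;
}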

Error:

../rtSafe/cuda/genericReformat.cu (1294) - Cuda Error in executeMemcpy: 1 (invalid argument)
FAILED_EXECUTION: std::exception
GPUassert: an illegal memory access was encountered src/modules/NvDCF/NvDCF.cpp 3891
ERROR: nvdsinfer_context_impl.cpp:1573 Failed to synchronize on cuda copy-coplete-event, cuda err_no:700, err_str:cudaErrorIllegalAddress
0:00:14.899472214 21913      0x352fd90 WARN                 nvinfer gstnvinfer.cpp:2021:gst_nvinfer_output_loop:<primary-inference> error: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR
0:00:14.899527117 21913      0x352fd90 WARN                 nvinfer gstnvinfer.cpp:616:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::releaseBatchOutput() <nvdsinfer_context_impl.cpp:1607> [UID = 1]: Tried to release an outputBatchID which is already with the context
Cuda failure: status=700 in CreateTextureObj at line 2902
nvbufsurftransform.cpp:2703: => Transformation Failed -2

Error: gst-stream-error-quark: Failed to dequeue output from inferencing. NvDsInferContext error: NVDSINFER_CUDA_ERROR (1): gstnvinfer.cpp(2021): gst_nvinfer_output_loop (): /GstPipeline:pipeline0/GstNvInfer:primary-inference
Segmentation fault (core dumped)

My pipeline:
uri_decode_bin → streammux → pgie → tracker → nvvideoconvert1 → videotemplate → nvosd → nvvideoconvert2 → capsfilter → avenc_mpeg4 → mpeg4videoparse → qtmux → filesink

Command to create engine:

trtexec --onnx=blend_blur_2inputs_up_to_1280.onnx --explicitBatch --fp16 --workspace=1024 --minShapes=image:220x220x3,mask:220x220 --optShapes=image:1280x1280x3,mask:1280x1280 --maxShapes=image:1280x1280x3,mask:1280x1280 --buildOnly --saveEngine=blendblur.engine

I will send you my ONNX file and custom library file when you respond. Could you tell me whether I have implemented something incorrectly, or whether it is impossible to do it this way?

Please share the files and a workable command or app for a local repro.

I sent you my files in a PM.

Hi, I found the reason for the issue. I was pushing uint data to the engine instead of float.
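For anyone hitting the same error: the fix amounts to converting the 8-bit frame and mask to float32 before copying them into the engine bindings. A minimal sketch, assuming the frame has already been mapped into a cv::Mat (function and buffer names are hypothetical):

#include <opencv2/core.hpp>
#include <cuda_runtime_api.h>

// The engine expects float32 inputs, so convert the 8-bit frame and mask
// before copying them into the device bindings. Pushing the raw uint8
// buffers produces undersized copies and illegal-address errors.
void uploadInputs(const cv::Mat &frameU8,  // CV_8UC3, h x w x 3
                  const cv::Mat &maskU8,   // CV_8UC1, h x w
                  float *d_image, float *d_mask, cudaStream_t stream)
{
    cv::Mat frameF32, maskF32;
    frameU8.convertTo(frameF32, CV_32FC3);  // optionally scale, e.g. 1/255.0
    maskU8.convertTo(maskF32, CV_32FC1);

    cudaMemcpyAsync(d_image, frameF32.ptr<float>(),
                    frameF32.total() * frameF32.elemSize(),
                    cudaMemcpyHostToDevice, stream);
    cudaMemcpyAsync(d_mask, maskF32.ptr<float>(),
                    maskF32.total() * maskF32.elemSize(),
                    cudaMemcpyHostToDevice, stream);
    cudaStreamSynchronize(stream);
}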
