Deepstream Python error - Custom Parse Function not found on InstanceSegment-postprocessor

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): Jetson Orin
• DeepStream Version: 6.1
• JetPack Version (valid for Jetson only): 5.0.1
• TensorRT Version: 8.4
• NVIDIA GPU Driver Version (valid for GPU only): 11.4
• Issue Type (questions, new requirements, bugs): question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing.)
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or which sample application, and the function description.)

Hi

I created an MRCNN custom model using the TAO Toolkit. The model trained and produced good results. I exported the model and then converted it successfully on the Orin to produce the engine file.
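
For reference, the conversion step on the Orin was roughly the following (a sketch: tao-converter options per the TAO documentation; the file names and paths here are assumptions, while the key, input dims, and output blob names match the config posted further down):

# hypothetical file names; -t int8 with -c supplies the calibration cache for an INT8 engine
./tao-converter -k nvidia_tlt \
    -d 3,1024,1024 \
    -o generate_detections,mask_fcn_logits/BiasAdd \
    -c maskrcnn.bin \
    -t int8 \
    -e model.etlt_b1_gpu0_int8.engine \
    model.etlt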

I also built the plugin module using the TensorRT instructions (including the updated version of CMake). During this process I noticed the following:

  1. I used a GPU compute capability of 87 for the Orin machine, obtained from deviceQuery:

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Orin"
CUDA Driver Version / Runtime Version 11.4 / 11.4
CUDA Capability Major/Minor version number: 8.7
Total amount of global memory: 30623 MBytes (32110190592 bytes)
(016) Multiprocessors, (128) CUDA Cores/MP: 2048 CUDA Cores
GPU Max Clock rate: 1300 MHz (1.30 GHz)
Memory Clock rate: 1300 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 4194304 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 167936 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS
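
For context, that compute capability is what feeds the TensorRT OSS plugin build as the GPU_ARCHS CMake variable. A typical native build on the Orin looks roughly like this (a sketch following the TAO/TensorRT OSS Jetson build instructions; flag names such as TRT_BIN_DIR may differ on your TensorRT branch):

# from the TensorRT/build directory, using the updated CMake
cmake .. -DGPU_ARCHS="87" \
    -DTRT_LIB_DIR=/usr/lib/aarch64-linux-gnu/ \
    -DTRT_BIN_DIR=`pwd`/out \
    -DCMAKE_C_COMPILER=/usr/bin/gcc
make -j$(nproc) nvinfer_plugin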

  2. Building the updated plugin lib did not create the "out" directory, but it did compile the library libnvinfer_plugin.so.8.4.1, which is the file I copied to /usr/lib/aarch64-linux…

Running ll libnvinfer_plugin* yields the following:

lrwxrwxrwx 1 root root       26 Jul 10 09:07 libnvinfer_plugin.so -> libnvinfer_plugin.so.8.4.1*
lrwxrwxrwx 1 root root       26 Jul  9 20:27 libnvinfer_plugin.so.8 -> libnvinfer_plugin.so.8.4.1*
-rwxr-xr-x 1 root root 26739504 Jul 10 10:41 libnvinfer_plugin.so.8.4.1*
-rw-r--r-- 1 root root 30558374 Apr 30 18:43 libnvinfer_plugin_static.a

Which I believe is correct.
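
For completeness, the install step amounted to copying the rebuilt library over the stock one and refreshing the linker cache, roughly as follows (destination path assumed; back up the original first):

# back up the stock plugin library, then install the rebuilt one
sudo cp /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.8.4.1 ~/libnvinfer_plugin.so.8.4.1.bak
sudo cp libnvinfer_plugin.so.8.4.1 /usr/lib/aarch64-linux-gnu/
sudo ldconfig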

My problem is that when I try to use the engine (in a Python pipeline used before) I get the following error:

ERROR nvinfer gstnvinfer.cpp:640:gst_nvinfer_logger: NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::initResource() <nvdsinfer_context_impl.cpp:862> [UID = 1]: Custom parse function not found for InstanceSegment-postprocessor

I am able to run the deepstream_tao sample apps successfully, and I also ran trtexec on the engine file, which passed.
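
The trtexec check was essentially this (engine file name assumed; on JetPack, trtexec ships under /usr/src/tensorrt/bin):

/usr/src/tensorrt/bin/trtexec --loadEngine=model.etlt_b1_gpu0_int8.engine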

Any help gratefully received - cheers

From the error, you did not set parse-bbox-instance-mask-func-name; please find it in the Gst-nvinfer — DeepStream 6.1 Release documentation.

Thank you @fanzh

Actually I have that entry in my config file:

[property]
gpu-id=0
model-color-format=0
net-scale-factor=0.017507
parse-bbox-instance-mask-func-name=NvDsInferParseCustomMrcnnTLT
network-type=3
#tlt-encoded-model=…/Models/save/model.etlt
model-engine-file=…/Models/save/model.etlt_b1_gpu0_int8.engine
labelfile-path=…/Models/save/labels.txt
int8-calib-file=…/Models/save/maskrcnn.bin
offsets=123.675;116.28;103.53
infer-dims=3;1024;1024
tlt-model-key=nvidia_tlt
network-type=3
num-detected-classes=2
uff-input-order=0
output-blob-names=generate_detections;mask_fcn_logits/BiasAdd
uff-input-blob-name=Input
model-color-format=0
maintain-aspect-ratio=0
output-tensor-meta=0
num-detected-classes=1
batch-size=1

# 0=FP32, 1=INT8, 2=FP16 mode

network-mode=1
interval=0
gie-unique-id=1
#no cluster

# 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3=DBSCAN+NMS Hybrid, 4=None (no clustering)

# MRCNN supports only cluster-mode=4; clustering is done by the model itself

cluster-mode=4
output-instance-mask=1

[class-attrs-all]
pre-cluster-threshold=0.8

Do I have the correct lib (libnvinfer_plugin.so.8.4.1), and is it installed correctly?

You did not set custom-lib-path; please refer to the DeepStream sample deepstream-mrcnn-test.
Also, the nvinfer plugin is open source, so you can add logs to debug; you can find that error raised in the function InstanceSegmentPostprocessor::initResource.
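
Concretely, that means adding a line like the following to the [property] group. The path below is the stock DeepStream custom-parser library; whether it exports the MRCNN parse function depends on your DeepStream version, so check the deepstream-mrcnn-test config for the exact pairing:

custom-lib-path=/opt/nvidia/deepstream/deepstream/lib/libnvds_infercustomparser.so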

Thank you @fanzh. I ended up using the TAO-specific parser plugin together with the TLT2 bbox function, and all was good. I've marked this as the solution.
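
For anyone landing here later, the working combination was along these lines (a sketch: the function name and library come from the deepstream_tao_apps post_processor build, and the exact path depends on where you cloned and built it):

parse-bbox-instance-mask-func-name=NvDsInferParseCustomMrcnnTLTV2
custom-lib-path=/path/to/deepstream_tao_apps/post_processor/libnvds_infercustomparser_tao.so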
