After ds6.2 -> 6.4 update, pyds import causes tensorrt to Abort (core dumped)

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU: Tesla T4 (AWS’s g4dn-xlarge)
• DeepStream Version 6.4 (nvcr.io/nvidia/deepstream:6.4-samples-multiarch)
• TensorRT Version 8.6.1 (from nvcr.io/nvidia/deepstream:6.4-samples-multiarch)
• NVIDIA GPU Driver Version (valid for GPU only) 535.171.04
• Issue Type( questions, new requirements, bugs) bug
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing) build attached dockerfile, run the attached test within the docker container

Hi,
We’ve been looking into upgrading our deepstream 6.2 based solution to deepstream 6.4 (note, 7.0 was released in the middle of this upgrade effort). However, it seems that tensorrt doesn’t work when used together with deepstream within a python app, when pyds bindings are used.

I was able to make a minimal case which reproduces the bug; see the attached dockerfile and trt-test.py. Running trt-test.py results in:
Aborted (core dumped) when calling context.get_binding_shape(i).
EDIT: removed irrelevant gdb-backtrace

Note - the bug disappears when we remove import pyds or from gi.repository import Gst, but this obviously is not a solution - our code which manages deepstream and dispatches GPU-resident buffers from deepstream into trt (and other places, incl. cupy ops) is a python app, so we need both of those bindings.

Note - attached dockerfile builds pyds itself, but when using precompiled pyds (from https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.1.10/pyds-1.1.10-py3-none-linux_x86_64.whl), the issue still persists

Edit - just checked deepstream 7.0, it exhibits the same behaviour.

Everything works just fine in deepstream 6.2 and trt 8.5.2.2.

trt-test.py.log (995 Bytes)
ds64-trt-test.Dockerfile.log (1.9 KB)

Did you upgrade the driver and CUDA versions on your host before upgrading DeepStream?
You can refer to the DS_Installation guide to install the appropriate versions.

Yes, I’ve upgraded them. Currently on the host I have:
Driver Version: 535.171.04 CUDA Version: 12.2 (as reported by nvidia-smi run on the host)
Which AFAIK meets the minimum requirements of both deepstream 6.4 and 7.0.

OK. Could you attach your model so that I can check that on my side?

I’ve used some standard mobilenet for testing, from here:
https://github.com/onnx/models/raw/main/validated/vision/classification/mobilenet/model/mobilenetv2-10.onnx
Note, the dockerfile which I’ve attached in the original post downloads it, so to replicate this on your side all you need to do (hopefully :-) ) is to run trt-test.py within this container.

I have tried your python file in our DeepStream 7.0 docker. There was no “Aborted” issue.

1. Run our deepstream-7.0 docker
2. $ cd /opt/nvidia/deepstream/deepstream
3. $ ./user_deepstream_python_apps_install.sh --build-bindings -r master
4. Modify your code to use the tensor-name API instead of context.get_binding_shape(i):
   model.get_tensor_name
   model.get_tensor_shape
5. $ python3 trt-test.py

I can confirm that changing from context.get_binding_shape(i) to model.get_tensor_shape(model.get_tensor_name(i)) works just fine.

Thanks.
I’ve wrongly assumed that these “deprecated” methods would still work.
I’ll just update my code to use these new ones.
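
For anyone landing here later, the change confirmed above can be sketched roughly as follows. This is only an illustration, not the original trt-test.py: the helper name io_tensor_shapes is made up, and it assumes `engine` is the deserialized tensorrt.ICudaEngine (the object called `model` in this thread). It relies only on num_io_tensors, get_tensor_name and get_tensor_shape.

```python
def io_tensor_shapes(engine):
    """Map each I/O tensor name to its shape using the name-based API.

    `engine` is assumed to be a tensorrt.ICudaEngine (called `model` in
    this thread); any object exposing the same three members also works.
    """
    shapes = {}
    for i in range(engine.num_io_tensors):
        # Old style (aborted in this thread): context.get_binding_shape(i)
        name = engine.get_tensor_name(i)  # index -> tensor name
        shapes[name] = tuple(engine.get_tensor_shape(name))  # name -> dims
    return shapes
```

With a real engine this would be called as io_tensor_shapes(model) right after deserializing it. For networks with dynamic shapes, the engine-level shape may contain -1 entries.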