DeepStream with Python bindings

Description


I am trying to get DeepStream to run on an AGX Orin. After installing with SDK Manager and running the package steps described below (torch, torchvision, and cloning deepstream_python_apps), I get the following error when trying to run deepstream_test_1.py:

Traceback (most recent call last):
  File "deepstream_test_1.py", line 26, in <module>
    from common.platform_info import PlatformInfo
  File "/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test1/../common/platform_info.py", line 21, in <module>
    from cuda import cudart
ModuleNotFoundError: No module named 'cuda'
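
For context: the missing cuda module comes from the cuda-python pip package. A minimal import check (a sketch, not part of the sample apps; cudaRuntimeGetVersion returns an error code and the runtime version):

# Sketch: verify the cuda-python bindings that platform_info.py imports.
try:
    from cuda import cudart  # provided by the cuda-python pip package
    err, version = cudart.cudaRuntimeGetVersion()
    print("CUDA runtime version:", version)
except ImportError as exc:
    print("cuda-python is not installed:", exc)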

Environment

TensorRT Version:
Installed from SDK Manager, JetPack 5.1.2, DS 6.3

GPU Type:
Jetson AGX Orin Developer Kit (64 GB)
Nvidia Driver Version:

CUDA Version:
11.4

CUDNN Version:
Matching version from SDK Manager

Operating System + Version:
Ubuntu 20.04

Python Version (if applicable):

TensorFlow Version (if applicable):

PyTorch Version (if applicable):
2.1.0a0+41361538.nv23.06
Torchvision: 0.16.0

Baremetal or Container (if container which image + tag):

Steps To Reproduce

sudo apt install -y libgstrtspserver-1.0-0
sudo apt install -y libjansson4

sudo apt install -y python3-gst-1.0
sudo apt install -y python-gi-dev
sudo apt install -y git libglib2.0-dev
sudo apt install -y libgirepository1.0-dev
sudo apt install -y libcairo2-dev
sudo apt install -y python3-pip

sudo pip install virtualenv

export TORCH_INSTALL=https://developer.download.nvidia.com/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
python3 -m pip install --no-cache $TORCH_INSTALL

sudo apt-get install -y libjpeg-dev
sudo apt-get install -y zlib1g-dev
sudo apt-get install -y libpython3-dev
sudo apt-get install -y libopenblas-dev
sudo apt-get install -y libavcodec-dev
sudo apt-get install -y libavformat-dev
sudo apt-get install -y libswscale-dev

git clone --branch v0.16.1 https://github.com/pytorch/vision torchvision
cd torchvision
export BUILD_VERSION=0.16.0  # where 0.x.0 is the torchvision version  
python3 setup.py install --user

export PYDS_INSTALL=https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.1.8/pyds-1.1.8-py3-none-linux_aarch64.whl
python3 -m pip install --no-cache $PYDS_INSTALL

cd /opt/nvidia/deepstream/deepstream-6.3/sources
git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps
cd ~

python3 -c "import torchvision as tv; print(tv.__version__); import torch as t; print(t.__version__)"
python3 -c "import torch; print('Is Cuda support available: ' + str(torch.cuda.is_available()))"

mkdir Development
sudo apt install -y libcanberra-gtk-module 
sudo apt install -y libcanberra-gtk3-module

sudo apt-get update
sudo apt-get install -y v4l-utils

sudo apt install -y terminator
pip3 install tqdm

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw , width=(int)1280 , height=(int)720, format=(string)GRAY8 , framerate=(fraction)60/1' ! nvvidconv ! 'video/x-raw(memory:NVMM),  width=(int)1280, height=(int)720, format=I420' ! nvvidconv ! 'video/x-raw, format=BGRx' !  videoconvert ! 'video/x-raw, format=BGR, width=(int)640 , height=(int)360' ! videoconvert ! xvimagesink -e

Some of the steps may not be needed, but I run them to ensure everything is installed.

STEPS:
Once the installation above is complete, with pyds 1.1.8 installed, I go to the cloned deepstream_python_apps (latest).

I run
git submodule update --init

This does not fail.

I still can't run deepstream_test_1.py; it always complains about CUDA.


./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "Orin"
  CUDA Driver Version / Runtime Version          11.4 / 11.4
  CUDA Capability Major/Minor version number:    8.7
  Total amount of global memory:                 62800 MBytes (65851035648 bytes)
  (008) Multiprocessors, (128) CUDA Cores/MP:    1024 CUDA Cores
  GPU Max Clock rate:                            1300 MHz (1.30 GHz)
  Memory Clock rate:                             612 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 4194304 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total shared memory per multiprocessor:        167936 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS

>>> import torch
>>> torch.cuda.is_available()
True

I think we faced the same issues, but I am containerizing it using Docker.

@DaneLLL Any ideas? I've seen that you have been able to help with similar issues.

Update:
I ran in terminal:

pip3 install Cython
pip3 install pycuda --user
pip3 install cuda-python

This adds the dependency needed for cuda and cudart.
BUT a new error:

python3 deepstream_test_1.py /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p
...
...
Playing file /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264 
Adding elements to Pipeline 

Linking elements in the Pipeline 

Traceback (most recent call last):
  File "deepstream_test_1.py", line 263, in <module>
    sys.exit(main(sys.argv))
  File "deepstream_test_1.py", line 225, in main
    sinkpad = streammux.request_pad_simple("sink_0")
AttributeError: 'GstNvStreamMux' object has no attribute 'request_pad_simple'
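
Side note: request_pad_simple() only exists in GStreamer 1.20 and newer; JetPack 5.x ships GStreamer 1.16, where the same call is named get_request_pad(). A version-tolerant sketch (the request_sink_pad helper is illustrative, not part of the sample):

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

def request_sink_pad(element, name="sink_0"):
    # Use the GStreamer >= 1.20 API when available, otherwise fall back
    # to the pre-1.20 name shipped with JetPack 5.x (GStreamer 1.16).
    if hasattr(element, "request_pad_simple"):
        return element.request_pad_simple(name)
    return element.get_request_pad(name)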

I think I found the issue. Did you pull the latest deepstream_python_apps from GitHub? Make sure you clone the version that matches the DeepStream version you are using.

I cloned v1.1.8 to match the pyds version.
I also tried the latest.

Did you get it to work without installing cuda-python?

Yes, I switched to the branch that matches DS 6.4; it worked for me without installing cuda-python.

Can you provide the installation steps to get it to work with 6.4?

I must be forgetting something in my installation, or installing in the wrong order. Do you sudo-install everything, or only when needed? My problem has been that the user is not allowed to apt install without sudo.

I start by flashing from scratch with SDK Manager.
What do you do after that?

Moving to the DeepStream forum for better support, thanks.

[Installation — DeepStream documentation 6.4 documentation]
I don't think they changed the version requirements, but make sure to install the 6.4 .deb.

How do I follow that?

Did you migrate GLib to a newer version and run:

 git clone https://github.com/GNOME/glib.git
 cd glib
 git checkout <glib-version-branch>

 meson build --prefix=/usr
 ninja -C build/
 cd build/
 ninja install
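
If you rebuild GLib, a quick check that Python actually picks up the new libraries (a sketch):

# Sketch: print the GLib and GStreamer versions visible from Python,
# to confirm the rebuilt GLib is the one actually being loaded.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import GLib, Gst

Gst.init(None)
print("GLib:", GLib.MAJOR_VERSION, GLib.MINOR_VERSION, GLib.MICRO_VERSION)
print("GStreamer:", Gst.version_string())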

1. You can't just install the *.whl without installing its dependencies.
Install the following dependencies in your Docker container:

apt install python3-gi python3-dev python3-gst-1.0 python-gi-dev git \
    python3 python3-pip python3.8-dev cmake g++ build-essential libglib2.0-dev \
    libglib2.0-dev-bin libgstreamer1.0-dev libtool m4 autoconf automake libgirepository1.0-dev libcairo2-dev

2. DS-6.3 needs to be cloned with the following command; master now points to DS-7.0:

 git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps -b v1.1.8

OK.
So I will reinstall the system with JetPack 5.1.3 and DS 6.3.
Then I will install

sudo apt install python3-gi python3-dev python3-gst-1.0 python-gi-dev git \
    python3 python3-pip python3.8-dev cmake g++ build-essential libglib2.0-dev \
    libglib2.0-dev-bin libgstreamer1.0-dev libtool m4 autoconf automake libgirepository1.0-dev libcairo2-dev

After this, I install pyds (the DeepStream Python bindings):

export PYDS_INSTALL=https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.1.8/pyds-1.1.8-py3-none-linux_aarch64.whl
python3 -m pip install --no-cache $PYDS_INSTALL

Then I will clone

git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps -b v1.1.8

To: /opt/nvidia/deepstream/deepstream-6.3/sources

Then I will run:

cd  /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test1
python3 deepstream_test_1.py /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264

That's it? No need for library paths, etc.?

No need, DS-6.3 corresponds to deepstream_python_apps-v1.1.8
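
If you want a quick sanity check before running the samples (a sketch; it only confirms the wheel installed for this interpreter):

# Sketch: confirm the pyds wheel is installed for this interpreter.
import pyds
print("pyds loaded from:", pyds.__file__)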

Hi,

I got the app to execute doing nothing more than the above.
This is what I get on the screen, and no video is displayed.

python3 deepstream_test_1.py /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264
Creating Pipeline 
 
Creating Source 
 
Creating H264Parser 

Creating Decoder 

Creating nv3dsink 

Playing file /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264 
Adding elements to Pipeline 

Linking elements in the Pipeline 

Starting pipeline 

Opening in BLOCKING MODE 
0:00:02.388274182  5746     0x1e7ffe00 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::initialize() <nvdsinfer_context_impl.cpp:1174> [UID = 1]: Warning, OpenCV has been deprecated. Using NMS for clustering instead of cv::groupRectangles with topK = 20 and NMS Threshold = 0.5
WARNING: Deserialize engine failed because file path: /opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test1/../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine open error
0:00:05.742701901  5746     0x1e7ffe00 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1976> [UID = 1]: deserialize engine from file :/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test1/../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine failed
0:00:05.904544806  5746     0x1e7ffe00 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2081> [UID = 1]: deserialize backend context from engine from file :/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test1/../../../../samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine failed, try rebuild
0:00:05.904601319  5746     0x1e7ffe00 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.
WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices

No more logs? Model conversion is not complete yet.

No more logs. It just floods the screen with

WARNING: [TRT]: Unknown embedded device detected. Using 59641MiB as the allocation cap for memory on embedded devices.

Normally, as you know, it should build the engine file and then run inference. But I get this TRT warning.

This is a known issue on the Orin 64GB board.
The message is a harmless warning, and TensorRT still works correctly.
Or you can upgrade to DS-7.0.
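
One way to tell whether the build eventually finished despite the warning flood is to check for the serialized engine file (a sketch, using the path printed in the warnings above):

# Sketch: check whether TensorRT finished serializing the engine,
# using the path printed in the warnings above.
import os

engine = ("/opt/nvidia/deepstream/deepstream-6.3/samples/models/"
          "Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine")
print("engine exists:", os.path.exists(engine))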

Hi,

I will upgrade to 7.0 and see if it works; I'll keep you posted.
The goal is to get YOLOv8 to work with DeepStream using GitHub - marcoslucianops/DeepStream-Yolo: NVIDIA DeepStream SDK 7.0 / 6.4 / 6.3 / 6.2 / 6.1.1 / 6.1 / 6.0.1 / 6.0 / 5.1 implementation for YOLO models

How can I move this to Deepstream forum?

Moved; it seems I failed to apply the move correctly, sorry.