Description
I am trying to get DeepStream running on an AGX Orin. After installing JetPack with SDK Manager and running the setup steps described below (installing torch and torchvision, and cloning deepstream_python_apps), I get the following error when trying to run deepstream_test_1.py:
Traceback (most recent call last):
File "deepstream_test_1.py", line 26, in <module>
from common.platform_info import PlatformInfo
File "/opt/nvidia/deepstream/deepstream-6.3/sources/deepstream_python_apps/apps/deepstream-test1/../common/platform_info.py", line 21, in <module>
from cuda import cudart
ModuleNotFoundError: No module named 'cuda'
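For reference, the import that fails in platform_info.py (`from cuda import cudart`) comes from the standalone cuda-python bindings package, which is separate from the CUDA 11.4 toolkit that SDK Manager installs. The failure can be reproduced in isolation with a minimal sketch like this:

```python
# Reproduce the failing import from platform_info.py in isolation.
# The 'cuda' module here is the cuda-python bindings package, not the
# CUDA toolkit itself, so deviceQuery passing does not guarantee it exists.
try:
    from cuda import cudart  # noqa: F401
    result = "cuda-python bindings are importable"
except ModuleNotFoundError as exc:
    result = str(exc)  # e.g. "No module named 'cuda'"

print(result)
```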
Environment
TensorRT Version:
Installed from SDK Manager (JetPack 5.1.2, DeepStream 6.3)
GPU Type:
Jetson AGX Orin Developer Kit (64 GB)
Nvidia Driver Version:
CUDA Version:
11.4
CUDNN Version:
matching from SDK Manager
Operating System + Version:
Ubuntu 20.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
2.1.0a0+41361538.nv23.06
Torchvision: 0.16.0
Baremetal or Container (if container which image + tag):
Steps To Reproduce
sudo apt install -y libgstrtspserver-1.0-0
sudo apt install -y libjansson4
sudo apt install -y python3-gst-1.0
sudo apt install -y python-gi-dev
sudo apt install -y git libglib2.0-dev
sudo apt install -y libgirepository1.0-dev
sudo apt install -y libcairo2-dev
sudo apt install -y python3-pip
sudo pip install virtualenv
export TORCH_INSTALL=https://developer.download.nvidia.com/compute/redist/jp/v512/pytorch/torch-2.1.0a0+41361538.nv23.06-cp38-cp38-linux_aarch64.whl
python3 -m pip install --no-cache $TORCH_INSTALL
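The wheel filename above is tagged cp38-cp38-linux_aarch64, i.e. it only installs into CPython 3.8 on aarch64. A quick stdlib-only check that the active interpreter matches those tags:

```python
import platform
import sys

# Derive the interpreter's wheel-tag components; the torch wheel above
# is tagged cp38 (CPython 3.8) and linux_aarch64.
py_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
arch = platform.machine()

print(f"interpreter tag: {py_tag}, architecture: {arch}")
if py_tag != "cp38" or arch != "aarch64":
    print("warning: the cp38/aarch64 torch wheel will not install here")
```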
sudo apt-get install -y libjpeg-dev
sudo apt-get install -y zlib1g-dev
sudo apt-get install -y libpython3-dev
sudo apt-get install -y libopenblas-dev
sudo apt-get install -y libavcodec-dev
sudo apt-get install -y libavformat-dev
sudo apt-get install -y libswscale-dev
git clone --branch v0.16.1 https://github.com/pytorch/vision torchvision
cd torchvision
export BUILD_VERSION=0.16.0 # where 0.x.0 is the torchvision version
python3 setup.py install --user
export PYDS_INSTALL=https://github.com/NVIDIA-AI-IOT/deepstream_python_apps/releases/download/v1.1.8/pyds-1.1.8-py3-none-linux_aarch64.whl
python3 -m pip install --no-cache $PYDS_INSTALL
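At this point the Python-side dependencies should all resolve in the same interpreter. A small sanity check (module names taken from the steps above, plus the `cuda` module that platform_info.py imports):

```python
import importlib.util

# Check, without importing them, that each module the DeepStream Python
# samples rely on is resolvable: pyds (the bindings wheel), gi (GObject
# introspection, from python3-gst-1.0), and cuda (the cuda-python bindings).
required = ["pyds", "gi", "cuda"]
missing = [name for name in required if importlib.util.find_spec(name) is None]

print("missing modules:", missing if missing else "none")
```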
cd /opt/nvidia/deepstream/deepstream-6.3/sources
git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps
cd ~
python3 -c "import torchvision as tv; print(tv.__version__); import torch as t; print(t.__version__)"
python3 -c "import torch; print('Is Cuda support available: ' + str(torch.cuda.is_available()))"
mkdir Development
sudo apt install -y libcanberra-gtk-module
sudo apt install -y libcanberra-gtk3-module
sudo apt-get update
sudo apt-get install -y v4l-utils
sudo apt install -y terminator
pip3 install tqdm
gst-launch-1.0 v4l2src device=/dev/video0 ! \
  'video/x-raw, width=(int)1280, height=(int)720, format=(string)GRAY8, framerate=(fraction)60/1' ! \
  nvvidconv ! \
  'video/x-raw(memory:NVMM), width=(int)1280, height=(int)720, format=I420' ! \
  nvvidconv ! \
  'video/x-raw, format=BGRx' ! \
  videoconvert ! \
  'video/x-raw, format=BGR, width=(int)640, height=(int)360' ! \
  videoconvert ! \
  xvimagesink -e
Some of these steps may not be strictly necessary, but I run them to make sure everything is installed.
STEPS:
Once the installation above is complete (including pyds 1.1.8), I go into the cloned deepstream_python_apps directory (latest) and run:
git submodule update --init
This does not fail.
I still cannot run deepstream_test_1.py; it always fails with the missing 'cuda' module error shown above.
./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Orin"
CUDA Driver Version / Runtime Version 11.4 / 11.4
CUDA Capability Major/Minor version number: 8.7
Total amount of global memory: 62800 MBytes (65851035648 bytes)
(008) Multiprocessors, (128) CUDA Cores/MP: 1024 CUDA Cores
GPU Max Clock rate: 1300 MHz (1.30 GHz)
Memory Clock rate: 612 Mhz
Memory Bus Width: 256-bit
L2 Cache Size: 4194304 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 167936 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 1536
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: Yes
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 0 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.4, CUDA Runtime Version = 11.4, NumDevs = 1
Result = PASS
>>> import torch
>>> torch.cuda.is_available()
True