• Hardware Platform (Jetson / GPU)
Ubuntu 18.04 Virtual machine on Azure with a Tesla T4 chip (there is no display, just SSH).
• DeepStream Version
Deepstream 6.0, Triton Docker container, extended to have deepstream python bindings installed. My docker file is:
FROM nvcr.io/nvidia/deepstream:6.0-triton

# To get video driver libraries at runtime (libnvidia-encode.so/libnvcuvid.so)
ENV NVIDIA_DRIVER_CAPABILITIES $NVIDIA_DRIVER_CAPABILITIES,video

# Include our deepstream python app in the image's filesystem
COPY . /opt/nvidia/deepstream/deepstream-6.0/sources/mypythonapp

# Get Deepstream Python bindings built (from the Deepstream sample Python app repo)
RUN apt update && apt install -y python-dev python3 python3-pip python3.8-dev cmake \
    g++ build-essential libglib2.0-dev libglib2.0-dev-bin python-gi-dev libtool m4 autoconf automake

WORKDIR /opt/nvidia/deepstream/deepstream-6.0/sources
RUN git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps.git
WORKDIR deepstream_python_apps
RUN git submodule update --init

WORKDIR 3rdparty/gst-python/
RUN git config --global http.sslverify false && ./autogen.sh && make && make install

WORKDIR ../../bindings
RUN mkdir build
WORKDIR build
RUN cmake .. -DPYTHON_MAJOR_VERSION=3 -DPYTHON_MINOR_VERSION=8
RUN make
RUN pip3 install ./pyds-1.1.0-py3-none*.whl
Notice I avoided Python 3.6: my first attempt at setting up the bindings resulted in a Python version mismatch between the GStreamer bindings and the DeepStream bindings. Is this an issue? (According to the bindings setup instructions, it isn't for Ubuntu 20.04.) I resorted to using the Docker container after the other setup options didn't work (more fun details about that below).
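In case it helps anyone hitting the same mismatch, this is the sanity check I'd run inside the container. Just a sketch; it assumes the container's default `python3` is the 3.8 the bindings were built against:

```shell
# Report the interpreter version that will load the bindings; pyds was built
# with -DPYTHON_MAJOR_VERSION=3 -DPYTHON_MINOR_VERSION=8, so this should
# print 3.8 in the container.
PYVER=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
echo "default python3 is ${PYVER}"
# Then: python3 -c 'import gi, pyds' should succeed with no ImportError
# if gst-python and pyds were built against this same interpreter.
```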
• TensorRT Version 8.0.1
• NVIDIA GPU Driver Version (valid for GPU only)
The VM comes with 495.29.05 installed, and I tried that first. The configs wouldn't run, so I tried Docker containers, where I could create NVIDIA plugin elements but they wouldn't do anything. I switched to 470.63.01, but things were still broken, so I also reinstalled CUDA (which seemed to automatically upgrade the driver to 510.47.03, along with TensorRT and the DeepStream SDK), rebooted the machine, and things worked even less after that. The NVIDIA plugins that were previously being created fine are no longer being created. When I run the provided configs, I get:
sudo deepstream-app -c config_infer_primary.txt
nvbufsurftransform:cuInit failed : 100
nvbufsurftransform:cuInit failed : 100
nvbufsurftransform:cuInit failed : 100
** ERROR: <create_multi_source_bin:1423>: Failed to create element 'src_bin_muxer'
** ERROR: <create_multi_source_bin:1516>: create_multi_source_bin failed
** ERROR: <create_pipeline:1323>: create_pipeline failed
** ERROR: <main:639>: Failed to create pipeline
Quitting
App run failed
When I try to gst-launch a simple pipeline with a decoder in it, I get:
ERROR: from element /GstPipeline:pipeline0/nvv4l2decoder:nvv4l2decoder0: Could not open device '/dev/nvidia0' for reading and writing.
I tried the whole process a few more times after manually removing the NVIDIA kernel modules, chose different DKMS registration and 32-bit compatibility options in the driver installer, and specified version 11.4.1 in the apt-get command for CUDA. It still swapped my 470.63.01 driver for 510.47.03, and also dumps an uninstall log when removing the 470.63.01 driver:
nvidia-installer log file '/var/log/nvidia-uninstall.log'
creation time: Fri Feb 11 21:21:16 2022
installer version: 470.63.01

PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin

nvidia-installer command line: /usr/bin/nvidia-uninstall -s

Using built-in stream user interface
-> Detected 4 CPUs online; setting concurrency level to 4.
-> If you plan to no longer use the NVIDIA driver, you should make sure that no X screens are configured to use the NVIDIA X driver in your X configuration file. If you used nvidia-xconfig to configure X, it may have created a backup of your original configuration. Would you like to run `nvidia-xconfig --restore-original-backup` to attempt restoration of the original X configuration file? (Answer: No)
-> Parsing log file:
-> done.
-> Validating previous installation:
-> The installed file '/usr/bin/nvidia-cuda-mps-control' seems to have changed, but `prelink -u` failed; unable to restore '/usr/bin/nvidia-cuda-mps-control' to an un-prelinked state.
-> The installed file '/usr/bin/nvidia-cuda-mps-server' seems to have changed, but `prelink -u` failed; unable to restore '/usr/bin/nvidia-cuda-mps-server' to an un-prelinked state.
-> The installed file '/usr/share/man/man1/nvidia-cuda-mps-control.1.gz' has a different checksum (2070222035l) than when it was installed (1240023249l).
-> The installed file '/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4.0.0' seems to have changed, but `prelink -u` failed; unable to restore '/usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4.0.0' to an un-prelinked state.
-> The installed file '/usr/share/man/man1/nvidia-persistenced.1.gz' has a different checksum (3630539026l) than when it was installed (325814339l).
-> The installed file '/usr/bin/nvidia-persistenced' seems to have changed, but `prelink -u` failed; unable to restore '/usr/bin/nvidia-persistenced' to an un-prelinked state.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1' has target 'libnvidia-ml.so.510.47.03', but it was installed with target 'libnvidia-ml.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libcuda.so.1' has target 'libcuda.so.510.47.03', but it was installed with target 'libcuda.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libcuda.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1' has target 'libnvidia-opencl.so.510.47.03', but it was installed with target 'libnvidia-opencl.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1' has target 'libnvidia-ptxjitcompiler.so.510.47.03', but it was installed with target 'libnvidia-ptxjitcompiler.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0' has target 'libGLX_nvidia.so.510.47.03', but it was installed with target 'libGLX_nvidia.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.0' has target 'libEGL_nvidia.so.510.47.03', but it was installed with target 'libEGL_nvidia.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.0 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.2' has target 'libGLESv2_nvidia.so.510.47.03', but it was installed with target 'libGLESv2_nvidia.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.2 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.1' has target 'libGLESv1_CM_nvidia.so.510.47.03', but it was installed with target 'libGLESv1_CM_nvidia.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/vdpau/libvdpau_nvidia.so.1' has target 'libvdpau_nvidia.so.510.47.03', but it was installed with target 'libvdpau_nvidia.so.470.63.01'. /usr/lib/x86_64-linux-gnu/vdpau/libvdpau_nvidia.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvoptix.so.1' has target 'libnvoptix.so.510.47.03', but it was installed with target 'libnvoptix.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvoptix.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.1' has target 'libnvidia-fbc.so.510.47.03', but it was installed with target 'libnvidia-fbc.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvcuvid.so.1' has target 'libnvcuvid.so.510.47.03', but it was installed with target 'libnvcuvid.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1' has target 'libnvidia-encode.so.510.47.03', but it was installed with target 'libnvidia-encode.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1 will not be uninstalled.
-> The previously installed symlink '/usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1' has target 'libnvidia-opticalflow.so.510.47.03', but it was installed with target 'libnvidia-opticalflow.so.470.63.01'. /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1 will not be uninstalled.
-> done.

WARNING: Your driver installation has been altered since it was initially installed; this may happen, for example, if you have since installed the NVIDIA driver through a mechanism other than nvidia-installer (such as your distribution's native package management system). nvidia-installer will attempt to uninstall as best it can. Please see the file '/var/log/nvidia-uninstall.log' for details.
-> Uninstalling NVIDIA Accelerated Graphics Driver for Linux-x86_64 (1.0-4706301 (470.63.01)):
-> DKMS module detected; removing...
-> Unable to restore symbolic link /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.1 -> libnvidia-ngx.so.495.29.05 (File exists).
-> Failed to delete the directory '/lib/firmware/nvidia' (Directory not empty).
-> Failed to delete the directory '/etc/OpenCL/vendors' (Directory not empty).
-> Failed to delete the directory '/usr/share/nvidia' (Directory not empty).
-> Failed to delete the directory '/usr/lib/nvidia' (Directory not empty).
Those attempts changed nothing :D
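One thing I suspect explains the driver swap: the generic `cuda` metapackage always pulls in the newest driver as a dependency, while the versioned toolkit package does not. A hedged sketch of what I'd try instead (package names assume NVIDIA's Ubuntu apt repo is configured):

```shell
# Install only the toolkit metapackage, which does not depend on a new driver,
# instead of "cuda" (which pulled in 510.47.03 for me):
sudo apt-get install -y cuda-toolkit-11-4

# And keep apt from replacing a hand-picked driver on later upgrades:
sudo apt-mark hold nvidia-driver-470
```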
After that, I tried removing the 510 version of the drivers and installing the recommended 470 version. This time around I noticed that librdkafka hadn't been building correctly before, which could have been causing some issues. Now I'm back to where I was originally, with behavior like:
/opt/nvidia/deepstream/deepstream-6.0/samples/configs/deepstream-app$ sudo gst-launch-1.0 videotestsrc is-live=1 ! x264enc ! nvv4l2decoder ! fakesink
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Redistribute latency...
terminate called after throwing an instance of 'NVDECException'
  what():  cuvidv4l2_handle_video_sequence_cb : Codec not supported on this GPU at src/cuvidv4l2_nvdec.cpp:223
Aborted

/opt/nvidia/deepstream/deepstream-6.0/samples/configs/deepstream-app$ sudo gst-launch-1.0 videotestsrc is-live=1 ! x264enc ! video/h264(memory:NVMM) ! nvv4l2decoder ! fakesink
-bash: syntax error near unexpected token `('

/opt/nvidia/deepstream/deepstream-6.0/samples/configs/deepstream-app$ sudo deepstream-app -c config_infer_primary.txt
** ERROR: <main:658>: Failed to set pipeline to PAUSED
Quitting
ERROR from src_bin_muxer: Output width not set
Debug info: gstnvstreammux.c(2779): gst_nvstreammux_change_state (): /GstPipeline:pipeline/GstBin:multi_src_bin/GstNvStreamMux:src_bin_muxer
App run failed
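A side note on the `-bash: syntax error` above: that one is the shell, not GStreamer. Parentheses are shell metacharacters, so caps strings have to be quoted, and the H.264 media type is `video/x-h264`, not `video/h264`. A sketch of the corrected pipeline (it still needs a working decoder, of course):

```shell
# Quote the caps so bash passes them to gst-launch untouched; the unquoted
# '(' is what produced "syntax error near unexpected token" above. x264enc
# outputs system memory, so no (memory:NVMM) feature on its caps either.
sudo gst-launch-1.0 videotestsrc is-live=1 ! x264enc ! \
  'video/x-h264, stream-format=(string)byte-stream' ! nvv4l2decoder ! fakesink
```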
After this, I uninstalled the drivers again and reinstalled via apt-get instead of the .run file (the closest version available is 470.103.01), and now I seem to be able to decode on the command line! However, the config files still fail in a similar way, and encoding from the command line gives me this:
sudo gst-launch-1.0 videotestsrc ! nvvideoconvert ! capsfilter caps="video/x-raw(memory:NVMM),format=(string)I420" ! nvv4l2h264enc ! fakesink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Redistribute latency...
Redistribute latency...
ERROR: from element /GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0: Could not get/set settings from/on resource.
Additional debug info:
gstv4l2object.c(3501): gst_v4l2_object_set_format_full (): /GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0:
Device is in streaming mode
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
Freeing pipeline ...
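One more thing I'd try after this many driver swaps, in case it helps: GStreamer caches its plugin registry per user, and a stale cache from a previous driver/DeepStream install can leave NVIDIA elements broken or missing. Clearing it is cheap (a sketch, not a confirmed fix):

```shell
# Force a plugin re-scan after changing drivers or DeepStream versions.
# Note: deepstream-app was run with sudo above, so it is root's cache
# (/root/.cache/gstreamer-1.0) that matters there, not the login user's.
rm -rf ~/.cache/gstreamer-1.0

# Then confirm the NVIDIA elements actually register:
gst-inspect-1.0 nvv4l2decoder | head -n 5
gst-inspect-1.0 nvv4l2h264enc | head -n 5
```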
• Issue Type( questions, new requirements, bugs)
Questions and bugs. Mainly: how do I get DeepStream set up correctly on this VM?