DeepStream for Tesla in Docker

hi all,

So, after putting CUDA 9, cuDNN 7, TensorRT 3, Video Codec SDK 8, and OpenCV in a Docker container and verifying it with nvidia-smi, I am down to one single error when compiling the DeepStream samples.

/usr/bin/ld: cannot find -lnvcuvid

Is it not possible to get this into a container? I have read in other threads that this library belongs to the driver itself, but even with nvidia-docker, or mounting these libraries into the container manually, no luck.
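One thing worth checking with nvidia-docker2: the runtime only injects the driver libraries named in the NVIDIA_DRIVER_CAPABILITIES environment variable, and the default (compute,utility) does not include libnvcuvid; you have to request the video capability. A rough sketch (the nvidia/cuda image here is just an example; your own image may already set this):

```shell
# nvidia-docker2 mounts driver libraries according to NVIDIA_DRIVER_CAPABILITIES;
# the default (compute,utility) does NOT include libnvcuvid, "video" does.
docker run --runtime=nvidia \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video \
    --rm nvidia/cuda \
    sh -c 'ldconfig -p | grep nvcuvid'
```

The same can be baked into the image with an ENV line in the Dockerfile, so every container started from it gets the decode libraries.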

Has anyone else been able to get past this nvcuvid issue in Docker, by chance? Oddly (per the links below), running locate libnvcuvid on the host does find a few libraries, while running the same command in the container finds nothing. However, running find / -name libnvcuvid* turns up all the same files in the container; they are simply not found by locate.
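(Side note on the locate/find discrepancy: locate answers from an index built by updatedb, which a fresh container has never built, while find walks the live filesystem, hence the difference. For example:)

```shell
# locate reads a prebuilt database; a fresh container has no database yet,
# so it reports nothing even when the files are present on disk:
locate libnvcuvid                 # empty until the index exists
updatedb                          # build/refresh the index (mlocate package)
locate libnvcuvid                 # now agrees with find

# find scans the filesystem directly, so it needs no index:
find / -name 'libnvcuvid*' 2>/dev/null
```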

Thank you in advance and please let me know if you require additional information. Cheers.

Other forum threads I consulted for some kind of guidance…
https://devtalk.nvidia.com/default/topic/769578/cuda-setup-and-installation/cuda-6-5-cannot-find-lnvcuvid/
https://devtalk.nvidia.com/default/topic/1032583/how-do-i-install-nvidia-video-codec-sdk-8-1-in-ubuntu-16-04-/

Haha, just figured it out… it's the makes/defines.inc file; it was pointed at the wrong NVIDIA_DISPLAY_DRIVER_PATH :)

Hopefully it helps someone else one day. Cheers…
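For example, if the container has the 390 driver libraries at /usr/lib/nvidia-390 (the path and the exact assignment syntax in defines.inc are assumptions; check your own layout), the fix amounts to something like:

```shell
# Point the sample makefiles' NVIDIA_DISPLAY_DRIVER_PATH at the directory
# that actually contains libnvcuvid.so in this container, so the linker
# can resolve -lnvcuvid. Path and file location are assumptions.
sed -i 's|^NVIDIA_DISPLAY_DRIVER_PATH.*|NVIDIA_DISPLAY_DRIVER_PATH := /usr/lib/nvidia-390|' \
    /opt/deepstream/makes/defines.inc
```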

root@bc15c44fbf91:/opt/deepstream/samples# make
make[1]: Entering directory '/opt/deepstream/samples/decPerf'
Compiling: nvCuvidPerf.cpp
Linking: ../bin/sample_decPerf
make[1]: Leaving directory '/opt/deepstream/samples/decPerf'
make[1]: Entering directory '/opt/deepstream/samples/nvDecInfer_classification'
Compiling: nvDecInfer.cpp
Linking: ../bin/sample_classification
make[1]: Leaving directory '/opt/deepstream/samples/nvDecInfer_classification'
make[1]: Entering directory '/opt/deepstream/samples/nvDecInfer_detection'
Compiling: presenterGL.cpp
Compiling: main.cpp
Compiling: drawBbox.cu
Linking: ../bin/sample_detection
/usr/bin/ld: warning: libcudart.so.9.0, needed by ../../lib/libdeepstream.so, may conflict with libcudart.so.8.0
make[1]: Leaving directory '/opt/deepstream/samples/nvDecInfer_detection'

Well, I guess I spoke too soon. While it did compile, I'm getting CUDA error 100 now when trying to run decPerf :(

CUDA error 100 at line 165 in file src/nvDecLite.cpp

Which is odd, considering I can't seem to find this nvDecLite.cpp file anywhere… presumably the source is only compiled into the prebuilt libdeepstream.so.

Please let me know if you have any thoughts here, thanks in advance.

root@a6abf615a42c:/opt/deepstream/samples/decPerf# ./run.sh
[DEBUG][22:39:30] Device ID: 0
[DEBUG][22:39:30] Video channels: 2
[DEBUG][22:39:30] Endless Loop: 1
[DEBUG][22:39:30] Device name: Tesla K80
[DEBUG][22:39:30] =========== Video Parameters Begin =============
[DEBUG][22:39:30] Video codec : AVC/H.264
[DEBUG][22:39:30] Frame rate : 30/1 = 30 fps
[DEBUG][22:39:30] Sequence format : Progressive
[DEBUG][22:39:30] Coded frame size: [1280, 720]
[DEBUG][22:39:30] Display area : [0, 0, 1280, 720]
[DEBUG][22:39:30] Chroma format : YUV 420
[DEBUG][22:39:30] =========== Video Parameters End =============
[ERROR][22:39:30] CUDA error 100 at line 165 in file src/nvDecLite.cpp
[ERROR][22:39:30] Decoder not initialized.
[ERROR][22:39:30] Decoder not initialized.
[ERROR][22:39:30] Decoder not initialized.

(that last line repeats over and over, thousands of times)

The only related forum thread I found:
https://devtalk.nvidia.com/default/topic/1027762/deepstream-for-tesla/cuda-error-100-amp-decoder-not-initialized-when-run-decperf-sample/

Hi,

CUDA error 100 indicates CUDA_ERROR_NO_DEVICE when calling cuvidCreateDecoder().

Which display driver did you install?
Please remember to update the driver path in the Makefile:

LIBRARIES := -L/usr/lib/nvidia-384

By the way, could you share nvidia-smi log with us?

Thanks.

nvidia-smi on the host EC2 instance:
nvidia-smi
Fri May 18 18:28:44 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   81C    P0   136W / 149W |   9730MiB / 11441MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      3533      C   python                                        64MiB |
|    0      3721      C   python2                                      112MiB |
|    0      3869      C   python2                                      115MiB |
|    0     12002      C   python2                                      112MiB |
|    0     15372      C   python2                                      115MiB |
|    0     31973      C   /usr/local/bin/caffe                        9146MiB |
+-----------------------------------------------------------------------------+

nvidia-smi via nvidia-docker2:
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
Fri May 18 18:30:37 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   82C    P0   150W / 149W |   9730MiB / 11441MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Which all seems fine…

Yet when I try to run DeepStream, I get the CUDA error 100 from within the container.

Also, I tried installing the 384 drivers, but was getting an error on reboot, so I reinstalled the 390 drivers.

nvidia-smi from my container yields something bad, though, so perhaps there are compatibility issues with 390?

nvidia-docker run --rm -it --name test test nvidia-smi
Failed to initialize NVML: Driver/library version mismatch

That is when I run it directly; but if I use bash as the entrypoint and then run nvidia-smi, it "seems" to work:
nvidia-docker run --rm -it --name test --entrypoint=/bin/bash -v /usr/lib/nvidia-390:/usr/lib/nvidia-390 test
root@4bc325d55d3e:/# nvidia-smi
Fri May 18 18:39:32 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48                 Driver Version: 390.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           On   | 00000000:00:1E.0 Off |                    0 |
| N/A   81C    P0   138W / 149W |   9730MiB / 11441MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

So then I go into my container, make the app, and I'm back to the missing nvcuvid, even though it compiled fine above, so I'm still missing something as usual. It either builds successfully and yields a CUDA error 100 when running the .sh file, or apparently can't find nvcuvid.

nvidia-docker run --rm -it --name test --entrypoint=/bin/bash deepstream:dev
root@6135622cda12:/# cd /opt/deepstream/samples/
root@6135622cda12:/opt/deepstream/samples# make
make[1]: Entering directory '/opt/deepstream/samples/decPerf'
Compiling: nvCuvidPerf.cpp
Linking: ../bin/sample_decPerf
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
Makefile.sample_decPerf:58: recipe for target '../bin/sample_decPerf' failed
make[1]: *** [../bin/sample_decPerf] Error 1
make[1]: Leaving directory '/opt/deepstream/samples/decPerf'
make[1]: Entering directory '/opt/deepstream/samples/nvDecInfer_classification'
Compiling: nvDecInfer.cpp
Linking: ../bin/sample_classification
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
Makefile.sample_classification:58: recipe for target '../bin/sample_classification' failed
make[1]: *** [../bin/sample_classification] Error 1
make[1]: Leaving directory '/opt/deepstream/samples/nvDecInfer_classification'
make[1]: Entering directory '/opt/deepstream/samples/nvDecInfer_detection'
Compiling: presenterGL.cpp
Compiling: main.cpp
Compiling: drawBbox.cu
Linking: ../bin/sample_detection
/usr/bin/ld: cannot find -lnvcuvid
collect2: error: ld returned 1 exit status
Makefile.sample_detection:87: recipe for target '../bin/sample_detection' failed
make[1]: *** [../bin/sample_detection] Error 1
make[1]: Leaving directory '/opt/deepstream/samples/nvDecInfer_detection'
Makefile:14: recipe for target 'all' failed
make: *** [all] Error 2

Definitely interesting… 390.48 vs. 390.30:

root@cad972b6c866:/# dmesg | grep NVRM | head -5
[ 18.163727] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 390.48 Thu Mar 22 00:42:57 PDT 2018 (using threaded interrupts)
[ 1825.934616] NVRM: API mismatch: the client has the version 390.30, but
[ 1825.934616] NVRM: this kernel module has the version 390.48. Please
[ 1825.934616] NVRM: make sure that this kernel module and all NVIDIA driver
[ 1825.934616] NVRM: components have the same version.

:)
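For later readers: the kernel module and the userspace driver libraries have to match exactly, or NVML and cuvid calls fail with mismatch/no-device errors. A quick way to compare the two sides (standard paths and flags for the Linux driver; run on your own setup to confirm):

```shell
# Kernel-side driver version, straight from the loaded module:
cat /proc/driver/nvidia/version

# Userspace view of the driver version, as NVML reports it:
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Which driver packages (and versions) the distro thinks are installed:
dpkg -l 'nvidia-*' | awk '/^ii/ {print $2, $3}'
```

If the first two disagree (here: 390.48 kernel module vs. 390.30 packages), reinstall so that both come from the same source.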

root@cad972b6c866:/opt/deepstream/samples# vi ../makes/defines.inc
root@cad972b6c866:/opt/deepstream/samples# make
make[1]: Entering directory '/opt/deepstream/samples/decPerf'
Compiling: nvCuvidPerf.cpp
Linking: ../bin/sample_decPerf
/usr/bin/ld: warning: libcudart.so.9.0, needed by ../../lib/libdeepstream.so, may conflict with libcudart.so.8.0
make[1]: Leaving directory '/opt/deepstream/samples/decPerf'
make[1]: Entering directory '/opt/deepstream/samples/nvDecInfer_classification'
Compiling: nvDecInfer.cpp
Linking: ../bin/sample_classification
make[1]: Leaving directory '/opt/deepstream/samples/nvDecInfer_classification'
make[1]: Entering directory '/opt/deepstream/samples/nvDecInfer_detection'
Compiling: presenterGL.cpp
Compiling: main.cpp
Compiling: drawBbox.cu
Linking: ../bin/sample_detection
/usr/bin/ld: warning: libcudart.so.9.0, needed by ../../lib/libdeepstream.so, may conflict with libcudart.so.8.0
make[1]: Leaving directory '/opt/deepstream/samples/nvDecInfer_detection'

root@cad972b6c866:/# dpkg -l | grep -E "nvidia|intel"
ii libdrm-intel1:amd64 2.4.83-1~16.04.1 amd64 Userspace interface to intel-specific kernel DRM services -- runtime
ii nvidia-390 390.30-0ubuntu1 amd64 NVIDIA binary driver - version 390.30
ii nvidia-390-dev 390.30-0ubuntu1 amd64 NVIDIA binary Xorg driver development files
ii nvidia-modprobe 390.30-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-icd-390 390.30-0ubuntu1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime
ii nvidia-settings 390.30-0ubuntu1 amd64 Tool for configuring the NVIDIA graphics driver

root@cad972b6c866:/# nvidia-smi -a

==============NVSMI LOG==============

Timestamp : Sat May 19 00:28:09 2018
Driver Version : 390.48

Attached GPUs : 1
GPU 00000000:00:1E.0
Product Name : Tesla K80
Product Brand : Tesla
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A

Thanks for the direction @AastaLLL, much appreciated… the CUDA_ERROR_NO_DEVICE hint definitely helped me figure it out faster. Is there a list of error codes I missed somewhere, by chance? As I'm sure this won't be my last :)

root@cad972b6c866:/opt/deepstream/samples/decPerf# ./run.sh
[DEBUG][00:38:47] Device ID: 0
[DEBUG][00:38:47] Video channels: 2
[DEBUG][00:38:47] Endless Loop: 1
[DEBUG][00:38:47] Device name: Tesla K80
[DEBUG][00:38:49] =========== Video Parameters Begin =============
[DEBUG][00:38:49] Video codec : AVC/H.264
[DEBUG][00:38:49] Frame rate : 30/1 = 30 fps
[DEBUG][00:38:49] Sequence format : Progressive
[DEBUG][00:38:49] Coded frame size: [1280, 720]
[DEBUG][00:38:49] Display area : [0, 0, 1280, 720]
[DEBUG][00:38:49] Chroma format : YUV 420
[DEBUG][00:38:49] =========== Video Parameters End =============
[DEBUG][00:38:49] =========== Video Parameters Begin =============
[DEBUG][00:38:49] Video codec : AVC/H.264
[DEBUG][00:38:49] Frame rate : 30/1 = 30 fps
[DEBUG][00:38:49] Sequence format : Progressive
[DEBUG][00:38:49] Coded frame size: [1280, 720]
[DEBUG][00:38:49] Display area : [0, 0, 1280, 720]
[DEBUG][00:38:49] Chroma format : YUV 420
[DEBUG][00:38:49] =========== Video Parameters End =============
[DEBUG][00:38:53] Video [0]: Decode Performance: 141.42 frames/second || Decoded Frames: 500
[DEBUG][00:38:53] Video [1]: Decode Performance: 123.11 frames/second || Decoded Frames: 500
[DEBUG][00:38:56] Video [0]: Decode Performance: 129.63 frames/second || Decoded Frames: 1000

root@cad972b6c866:/opt/deepstream/samples/nvDecInfer_detection# ./run.sh
/opt/deepstream/samples/data/video /opt/deepstream/samples/nvDecInfer_detection
/opt/deepstream/samples/nvDecInfer_detection
[DEBUG][00:47:06] Device ID for display [0]: Tesla K80
[DEBUG][00:47:06] Device ID for inference [0]: Tesla K80
[DEBUG][00:47:06] Video channels: 4
[ERROR][00:47:06] Warning: No mean files.
[DEBUG][00:47:06] GUI enabled.
[DEBUG][00:47:06] Endless Loop: 0
[DEBUG][00:47:06] Device name: Tesla K80
[DEBUG][00:47:06] Use INT8 data type.
Int8 support requested on hardware without native Int8 support, performance will be negatively affected.
[DEBUG][00:47:17] =========== Network Parameters Begin ===========
[DEBUG][00:47:17] Network Input:
[DEBUG][00:47:17] >Batch :4
[DEBUG][00:47:17] >Channel :3
[DEBUG][00:47:17] >Height :368
[DEBUG][00:47:17] >Width :640
[DEBUG][00:47:17] Network Output [0]
[DEBUG][00:47:17] >Channel :4
[DEBUG][00:47:17] >Height :23
[DEBUG][00:47:17] >Width :40
[DEBUG][00:47:17] Network Output [1]
[DEBUG][00:47:17] >Channel :16
[DEBUG][00:47:17] >Height :23
[DEBUG][00:47:17] >Width :40
[DEBUG][00:47:17] =========== Network Parameters End ===========
freeglut (dummy): failed to open display '' … need to mount X11 ;)

Hi,

I suppose there is no physical display connected on the Amazon instance.

Please update the configuration located at [deepstream_root]/samples/nvDecInfer_detection/run.sh to turn off the display:

-gui=0

Thanks.

Thanks for the response; yes, I am aware of that, I was just happy that it tried to run. Thanks for the tip though.
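For anyone else on a headless instance, the switch is a one-liner (the -gui flag name is as given in the reply above; whether your run.sh currently says -gui=1 is an assumption, so check the exact form first):

```shell
# Headless EC2 has no X display, so the GL presenter can't open one.
# Flip the detection sample's GUI flag off before launching:
sed -i 's/-gui=1/-gui=0/' /opt/deepstream/samples/nvDecInfer_detection/run.sh
```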

Eager for the June 11th release, but so far I have been able to get DeepStream for Tesla 1.5 running in Docker, as well as DeepStream for Jetson 1.0/1.5, so all is well deployment-wise at this time.

Thanks again for the direction, and for those that follow: pay very close attention to your local setup and the makes/defines.inc file that contains the paths… especially the driver version.

Cheers.