Cuda failure in Deepstream docker on Centos 7

Hi,

We have recently been testing different hardware-accelerated video analytics solutions, and Deepstream seems promising.
We would like to try it out on our MEC device, which has 2 Quadro RTX 8000 cards and NVIDIA GRID installed.
The host OS is CentOS 7.

Nvidia version info:
[root@198 ~]# nvidia-smi
Fri May 29 13:38:56 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.46 Driver Version: 430.46 CUDA Version: N/A |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Quadro RTX 8000 On | 00000000:37:00.0 Off | Off |
| 33% 40C P8 32W / 260W | 48984MiB / 49151MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Quadro RTX 8000 On | 00000000:86:00.0 Off | Off |
| 33% 39C P8 36W / 260W | 48984MiB / 49151MiB | 0% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 42351 C+G vgpu 4056MiB |
| 0 42997 C+G vgpu 4056MiB |
| 0 43192 C+G vgpu 4056MiB |
| 0 43255 C+G vgpu 4056MiB |
| 0 43612 C+G vgpu 4056MiB |
| 0 43982 C+G vgpu 4056MiB |
| 0 44081 C+G vgpu 4056MiB |
| 0 44138 C+G vgpu 4056MiB |
| 0 44535 C+G vgpu 4056MiB |
| 0 44790 C+G vgpu 4056MiB |
| 0 44905 C+G vgpu 4056MiB |
| 0 45553 C+G vgpu 4056MiB |
| 1 45039 C+G vgpu 3042MiB |
| 1 45698 C+G vgpu 3042MiB |
| 1 45786 C+G vgpu 3042MiB |
| 1 45934 C+G vgpu 3042MiB |
| 1 46442 C+G vgpu 3042MiB |
| 1 46581 C+G vgpu 3042MiB |
| 1 46712 C+G vgpu 3042MiB |
| 1 46792 C+G vgpu 3042MiB |
| 1 47388 C+G vgpu 3042MiB |
| 1 47505 C+G vgpu 3042MiB |
| 1 47693 C+G vgpu 3042MiB |
| 1 47743 C+G vgpu 3042MiB |
| 1 48318 C+G vgpu 3042MiB |
| 1 48461 C+G vgpu 3042MiB |
| 1 48652 C+G vgpu 3042MiB |
| 1 48727 C+G vgpu 3042MiB |
+-----------------------------------------------------------------------------+

[root@198 ~]# modinfo nvidia
filename: /lib/modules/3.10.0-1062.9.1.el7.x86_64/weak-updates/nvidia/nvidia.ko
alias: char-major-195-*
version: 430.46
supported: external
license: NVIDIA
retpoline: Y
rhelversion: 7.7
srcversion: 922226EAFE970320108DB9A
alias: pci:v000010DEd00000E00sv*sd*bc04sc80i00*
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends: ipmi_msghandler
vermagic: 3.10.0-1057.el7.x86_64 SMP mod_unload modversions
parm: NvSwitchRegDwords:NvSwitch regkey (charp)
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_EnableStreamMemOPs:int
parm: NVreg_EnableBacklightHandler:int
parm: NVreg_RestrictProfilingToAdminUsers:int
parm: NVreg_PreserveVideoMemoryAllocations:int
parm: NVreg_DynamicPowerManagement:int
parm: NVreg_EnableUserNUMAManagement:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_KMallocHeapMaxSize:int
parm: NVreg_VMallocHeapMaxSize:int
parm: NVreg_IgnoreMMIOCheck:int
parm: NVreg_NvLinkDisable:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RegistryDwordsPerDevice:charp
parm: NVreg_RmMsg:charp
parm: NVreg_GpuBlacklist:charp
parm: NVreg_TemporaryFilePath:charp
parm: NVreg_AssignGpus:charp

As we are not planning to update our NVIDIA driver for now, and Deepstream is not supported on CentOS yet, we tried the dockerized version: nvcr.io/nvidia/deepstream:4.0.2-19.12-devel.

We start it with the following command:
[centos@hp-gpu-node1 ~]$ docker run --gpus all -it --rm -v /tmp/.X11-unix:/tmp/.X11-unix --env="DISPLAY" --net=host -e DISPLAY=$DISPLAY -w /opt/nvidia/deepstream/deepstream-4.0 --volume="$HOME/.Xauthority:/root/.Xauthority:rw" nvcr.io/nvidia/deepstream:4.0.2-19.12-devel

Then we try to run deepstream-test1-app, but get the following error:

root@hp-gpu-node1:/opt/nvidia/deepstream/deepstream-4.0# cd ~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1/
root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1# deepstream-test1-app …/…/…/…/samples/streams/sample_720p.h264
Now playing: …/…/…/…/samples/streams/sample_720p.h264
libEGL warning: DRI3: failed to query the version
libEGL warning: DRI2: failed to authenticate
Creating LL OSD context new
0:00:08.188922891 10 0x557358461430 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
0:00:18.115738173 10 0x557358461430 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:generateTRTModel(): Storing the serialized cuda engine to file at /root/deepstream_sdk_v4.0.2_x86_64/samples/models/Primary_Detector/resnet10.caffemodel_b1_int8.engine
Running…
Cuda failure: status=801

Could you please help us find out what the problem could be?

It could be that your driver version (430.46) is too low. It's probably what's in the package repositories, but it's not new enough, so you have to install a newer one using Nvidia's instructions for your distro.

I haven't used CentOS in a while, but this is the case on Ubuntu, where an extra apt repo must be added (or the driver built using the .run file, but that will break on a kernel update). If you have secure boot enabled, I'd recommend the DKMS package (if it exists for CentOS). At least on Ubuntu this is the easiest way to configure automatic module signing. Without it you'll have to manually sign the kernel module on every update (or disable secure boot, which is what most people do, and which is bad).

Thanks for the tip.
According to this table, Deepstream SDK 4.0.2 is supported from R418+:


So I think the driver version should be OK for this deepstream version.

Can you please point me to that missing repo, or to proper guidance for installing the docker version on CentOS?

I suspect that some host vs. docker CUDA linkage is missing, but I can't work out the exact solution.
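
One quick way to check the host side of that suspicion is to look at the header line that nvidia-smi prints. A minimal Python sketch, assuming the header has the same format as in the output pasted above; the function name parse_smi_header is only illustrative:

```python
import re

def parse_smi_header(text):
    """Return (driver_version, cuda_version) from nvidia-smi output.

    cuda_version is None when the header says "N/A" or has no such field.
    """
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*(\S+)", text)
    driver_version = driver.group(1) if driver else None
    cuda_version = cuda.group(1) if cuda else None
    if cuda_version == "N/A":
        cuda_version = None
    return driver_version, cuda_version

# Header line copied from the nvidia-smi output in this thread.
header = "| NVIDIA-SMI 430.46 Driver Version: 430.46 CUDA Version: N/A |"
print(parse_smi_header(header))
```

For the header above this gives ('430.46', None), i.e. the host driver does not report a CUDA version. Whether that is related to the failure is not established here.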

Sorry. I didn’t realize you were at version 4. That’s my fault for not reading closely enough. I think somebody from Nvidia will have to answer your question in this case since I’m not sure what the issue is.

Thanks, @mdegans!

Hi @akos.peter.szabo,
could you try the change below, then build and run again with "./deepstream-test1-app …/…/…/…/samples/streams/sample_720p.h264"?

--- a/deepstream_test1_app.c
+++ b/deepstream_test1_app.c
@@ -203,7 +203,8 @@ main (int argc, char *argv[])
 #ifdef PLATFORM_TEGRA
   transform = gst_element_factory_make ("nvegltransform", "nvegl-transform");
 #endif
-  sink = gst_element_factory_make ("nveglglessink", "nvvideo-renderer");
+  //sink = gst_element_factory_make ("nveglglessink", "nvvideo-renderer");
+  sink = gst_element_factory_make ("fakesink", "nvvideo-renderer");

   if (!source || !h264parser || !decoder || !pgie
       || !nvvidconv || !nvosd || !sink) {

Also, could you try cuda-gdb to capture the backtrace:

cuda-gdb ./deepstream-test1-app …/…/…/…/samples/streams/sample_720p.h264

After the crash, enter "bt" to get the backtrace.

Thanks!

Hi @mchi,

I tried the modified application, but CUDA still shows failure.

I also tried to run it with cuda-gdb, but the application does not crash; it only reports "Cuda failure: status=801".
So I couldn't get a backtrace.
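
For reference, the status number itself can be looked up. A small Python sketch, assuming the number printed is a CUDA runtime cudaError_t value as numbered in CUDA 10.1 and later; only a handful of codes are listed, and the helper name describe_status is made up:

```python
import re

# Subset of cudaError_t values as numbered in the CUDA runtime headers
# (CUDA 10.1 and later). Assumption: "status" in the log is a cudaError_t.
CUDA_ERRORS = {
    0: "cudaSuccess",
    2: "cudaErrorMemoryAllocation",
    35: "cudaErrorInsufficientDriver",
    100: "cudaErrorNoDevice",
    801: "cudaErrorNotSupported",
    999: "cudaErrorUnknown",
}

def describe_status(line):
    """Map a 'Cuda failure: status=N' log line to a symbolic error name."""
    match = re.search(r"status=(\d+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # Codes outside the small table above are reported numerically.
    return CUDA_ERRORS.get(code, "unlisted code %d" % code)

print(describe_status("Cuda failure: status=801"))
```

If that assumption holds, the line from the log maps to cudaErrorNotSupported.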

Thanks,
Akos

Could you try all three commands below?

$ gst-launch-1.0 filesrc location=…/…/…/…/samples/streams/sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m width=1920 height=1080 batch-size=1 batched-push-timeout=4000000 ! nvinfer config-file-path=dstest1_pgie_config.txt ! nvvideoconvert ! nvdsosd ! fakesink

$ gst-launch-1.0 filesrc location=…/…/…/…/samples/streams/sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m width=1920 height=1080 batch-size=1 batched-push-timeout=4000000 ! nvinfer config-file-path=dstest1_pgie_config.txt ! nvvideoconvert ! fakesink

$ gst-launch-1.0 filesrc location=…/…/…/…/samples/streams/sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m width=1920 height=1080 batch-size=1 batched-push-timeout=4000000 ! nvinfer config-file-path=dstest1_pgie_config.txt ! fakesink

Hi mchi,

I tried these three commands; all of them give the same error (Cuda failure: status=801).

root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1# gst-launch-1.0 filesrc location=…/…/…/…/samples/streams/sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m width=1920 height=1080 batch-size=1 batched-push-timeout=4000000 ! nvinfer config-file-path=dstest1_pgie_config.txt ! nvvideoconvert ! nvdsosd ! fakesink
Setting pipeline to PAUSED …
Creating LL OSD context new
0:00:00.804678715 617 0x556678d4b550 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
0:00:10.661127189 617 0x556678d4b550 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:generateTRTModel(): Storing the serialized cuda engine to file at /root/deepstream_sdk_v4.0.2_x86_64/samples/models/Primary_Detector/resnet10.caffemodel_b1_int8.engine
Pipeline is PREROLLING …
Cuda failure: status=801
^Chandling interrupt.
Interrupt: Stopping pipeline …
ERROR: pipeline doesn’t want to preroll.
Setting pipeline to NULL …

^C
root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1#
root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1# gst-launch-1.0 filesrc location=…/…/…/…/samples/streams/sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m width=1920 height=1080 batch-size=1 batched-push-timeout=4000000 ! nvinfer config-file-path=dstest1_pgie_config.txt ! nvvideoconvert ! fakesink
Setting pipeline to PAUSED …
0:00:00.815580425 627 0x5653391b7920 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
0:00:10.758174525 627 0x5653391b7920 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:generateTRTModel(): Storing the serialized cuda engine to file at /root/deepstream_sdk_v4.0.2_x86_64/samples/models/Primary_Detector/resnet10.caffemodel_b1_int8.engine
Pipeline is PREROLLING …
Cuda failure: status=801
^Chandling interrupt.
Interrupt: Stopping pipeline …
ERROR: pipeline doesn’t want to preroll.
Setting pipeline to NULL …
^C
root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1#
root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1# gst-launch-1.0 filesrc location=…/…/…/…/samples/streams/sample_720p.h264 ! h264parse ! nvv4l2decoder ! m.sink_0 nvstreammux name=m width=1920 height=1080 batch-size=1 batched-push-timeout=4000000 ! nvinfer config-file-path=dstest1_pgie_config.txt ! fakesink
Setting pipeline to PAUSED …
0:00:00.792476123 637 0x55b01adc3a60 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:initialize(): Trying to create engine from model files
0:00:10.393224037 637 0x55b01adc3a60 INFO nvinfer gstnvinfer.cpp:519:gst_nvinfer_logger: NvDsInferContext[UID 1]:generateTRTModel(): Storing the serialized cuda engine to file at /root/deepstream_sdk_v4.0.2_x86_64/samples/models/Primary_Detector/resnet10.caffemodel_b1_int8.engine
Pipeline is PREROLLING …
Cuda failure: status=801
^Chandling interrupt.
Interrupt: Stopping pipeline …
ERROR: pipeline doesn’t want to preroll.
Setting pipeline to NULL …
^C
root@hp-gpu-node1:~/deepstream_sdk_v4.0.2_x86_64/sources/apps/sample_apps/deepstream-test1#

Can you run any CUDA sample and TensorRT sample on your system?
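
If the packaged samples are not at hand, a rough first check is to call the CUDA driver API directly from inside the container. A sketch in Python using ctypes, assuming libcuda.so.1 is visible to the process; it only calls cuInit, cuDeviceGetCount and cuGetErrorName, the function name probe_cuda is illustrative, and it is not a replacement for the real CUDA or TensorRT samples:

```python
import ctypes

def probe_cuda(libname="libcuda.so.1"):
    """Return (status_name, device_count); device_count is None on failure."""
    try:
        cuda = ctypes.CDLL(libname)
    except OSError as exc:
        return ("library not loadable: %s" % exc, None)

    def error_name(code):
        # cuGetErrorName fills in a pointer to a static string.
        ptr = ctypes.c_char_p()
        if cuda.cuGetErrorName(code, ctypes.byref(ptr)) == 0 and ptr.value:
            return ptr.value.decode()
        return "CUresult %d" % code

    rc = cuda.cuInit(0)  # must succeed before any other driver API call
    if rc != 0:
        return (error_name(rc), None)

    count = ctypes.c_int(0)
    rc = cuda.cuDeviceGetCount(ctypes.byref(count))
    if rc != 0:
        return (error_name(rc), None)
    return ("CUDA_SUCCESS", count.value)

print(probe_cuda())
```

Running it both on the host and inside the container should show whether basic CUDA initialization works in each place.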

Thanks!

Hi

Is this still an issue that needs support? Is there any result or status you can share?

Thanks

Hi all,

We have put this topic on hold for now.
You can close this thread.
We will reopen it when needed.

Thanks,
Akos