NVOSD BUG : Unable to copy between two host memories/Unable to allocate host memory

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
deepstream-6.1
docker image : nvcr.io/nvidia/deepstream:6.1.1-triton
• NVIDIA GPU Driver Version (valid for GPU only)
NVIDIA-SMI 525.85.05 Driver Version: 525.85.05 CUDA Version: 12.0
GPU 0: NVIDIA GeForce RTX 2070 SUPER

• Issue Type( questions, new requirements, bugs)
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_ref_app_deepstream.html#osd-group

When the NvOSD processing mode is set to GPU, a segmentation fault occurs.
With NvOSD processing on CPU, everything works fine.

Failing configuration:

 nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
 nvosd.set_property('process-mode',1)

Working

 nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
 nvosd.set_property('process-mode',0)
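The two snippets above differ only in the `process-mode` value (0 = CPU, 1 = GPU, per the nvdsosd property enum). A minimal combined sketch; the `gi` import and element creation are guarded so it stays importable outside a DeepStream container:

```python
# nvdsosd process-mode values used in this report (0 = CPU, 1 = GPU).
OSD_PROCESS_MODE = {"CPU": 0, "GPU": 1}

def make_osd(mode="CPU"):
    """Create an nvdsosd element with the given process mode.

    Returns None where GStreamer or the DeepStream plugin is
    unavailable, so this sketch can run outside the container.
    """
    try:
        import gi
        gi.require_version("Gst", "1.0")
        from gi.repository import Gst
    except (ImportError, ValueError):
        return None
    Gst.init(None)
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
    if nvosd is None:  # nvdsosd plugin not installed
        return None
    nvosd.set_property("process-mode", OSD_PROCESS_MODE[mode])
    return nvosd
```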

• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I’m processing multiple videos. Some are fully processed without issue, but on other videos the nvosd plugin raises errors after some elapsed time, always at the same point in the video.
I have already validated the videos for corruption and re-encoded them in other formats, but the same problem happens.

I’m using Triton Server with 6 models on the same host, with decode and encode on the GPU. Only NvOSD on GPU causes the issue.

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

The errors raised are not the same across executions.

First Execution

Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
ERROR: infer_preprocess.cpp:563 cudaMemset2DAsync failed to set 0 to scaled padding area, cuda err_no:700, err_str:cudaErrorIllegalAddress
Unable to release host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Segmentation fault (core dumped)

Second Execution

Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
nvbufsurface: NvBufSurfaceCopy: failed to copy
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
ERROR: infer_preprocess.cpp:391 Failed to convert input-format to network format
Unable to release host memory.
ERROR: infer_preprocess.cpp:280 Failed to preprocess buffer during cudaTransform.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Segmentation fault (core dumped)

Third Execution

Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
GPUassert: an illegal memory access was encountered src/modules/cuDCFv2/cuDCF.cu 778
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to release host memory.
Unable to allocate host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
ERROR: infer_preprocess.cpp:563 cudaMemset2DAsync failed to set 0 to scaled padding area, cuda err_no:700, err_str:cudaErrorIllegalAddress
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Segmentation fault (core dumped)

Linux DMESG

[ 6132.623144] NVRM: Xid (PCI:0000:0f:00): 31, pid=145917, name=python3, Ch 00000032, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7f4e_cc7e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 6132.627429] SurfConv-1[145989]: segfault at 400000000 ip 00007f5011d265c2 sp 00007f4d43ffe390 error 4 in libcuda.so.525.85.05[7f5011b33000+491000]
[ 6132.627439] Code: 00 8b 44 24 18 48 81 c4 b8 00 00 00 5b 41 5c c3 0f 1f 84 00 00 00 00 00 48 c7 44 24 38 00 00 00 00 48 85 db 0f 84 a6 00 00 00 <48> 8b 3b e8 76 d3 1d 00 85 c0 75 cf 8b 43 48 85 c0 75 16 8b 53 60
[ 6132.627614] NetPreproc2[145979]: segfault at 40 ip 00007f4feec9cc1a sp 00007f4dcbffeac0 error 4 in libnvds_infer_server.so[7f4feec3b000+475000]
[ 6132.627625] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[ 6185.237163] NVRM: Xid (PCI:0000:0f:00): 31, pid=146058, name=python3, Ch 00000035, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7ff4_907e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 6293.066185] NVRM: Xid (PCI:0000:0f:00): 31, pid=146322, name=python3, Ch 00000032, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_6 faulted @ 0x7fbf_447e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 6293.070448] NetPreproc1[146390]: segfault at 40 ip 00007fc049737c1a sp 00007fbd97ffeac0 error 4 in libnvds_infer_server.so[7fc0496d6000+475000]
[ 6293.070457] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[ 9275.190965] NVRM: Xid (PCI:0000:0f:00): 31, pid=150262, name=python3, Ch 00000036, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_6 faulted @ 0x7f10_067e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 9275.195639] SurfConv-1[150332]: segfault at 40 ip 00007f115627ec1a sp 00007f0e83ffeac0 error 4 in libnvds_infer_server.so[7f115621d000+475000]
[ 9275.195651] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[10094.157628] NVRM: Xid (PCI:0000:0f:00): 31, pid=153127, name=python3, Ch 00000031, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7f63_1e7e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[10094.161843] NetPreproc1[153201]: segfault at 40 ip 00007f647321dc1a sp 00007f618fffeac0 error 4 in libnvds_infer_server.so[7f64731bc000+475000]
[10094.161859] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[10166.773101] NVRM: Xid (PCI:0000:0f:00): 31, pid=153319, name=python3, Ch 00000035, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7fa6_e27e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[10166.778003] SurfConv-1[153389]: segfault at 40 ip 00007fa828c63c1a sp 00007fa55fffeac0 error 4 in libnvds_infer_server.so[7fa828c02000+475000]
[10166.778019] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[10284.695579] NVRM: Xid (PCI:0000:0f:00): 31, pid=153649, name=python3, Ch 00000034, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_0 faulted @ 0x7fe0_767e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[10284.700092] SurfConv-1[153719]: segfault at 40 ip 00007fe1b5b28c1a sp 00007fdef3ffeac0 error 4 in libnvds_infer_server.so[7fe1b5ac7000+475000]
[10284.700111] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
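Every NVRM entry in the dmesg excerpt above is an Xid 31 (GPU MMU fault). These lines can be pulled out programmatically; a minimal sketch, where the regex is an assumption based on the log format shown above, not a documented NVRM format:

```python
import re

# Matches NVRM Xid lines as they appear in the dmesg excerpt above.
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+),"
)

def parse_xid(line):
    """Return PCI id, Xid code, pid and process name, or None on no match."""
    m = XID_RE.search(line)
    if not m:
        return None
    d = m.groupdict()
    d["xid"] = int(d["xid"])
    d["pid"] = int(d["pid"])
    return d

if __name__ == "__main__":
    sample = ("[ 6132.623144] NVRM: Xid (PCI:0000:0f:00): 31, pid=145917, "
              "name=python3, Ch 00000032, intr 00000000.")
    print(parse_xid(sample))
```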

  1. could you share the whole media pipeline? which deepstream sample are you testing?
  2. could you provide simplified code based on deepstream sample to reproduce this issue?
  3. the error log suggests some memory issues; could you check the memory usage while testing? please refer to this memory monitoring method: DeepStream SDK FAQ - #14 by mchi

I will clean up the code and provide it in the next few days.

Memory GPU/CPU usage

user@host:~$ nvidia-smi dmon -d 1
# gpu   pwr gtemp mtemp    sm   mem   enc   dec  mclk  pclk
# Idx     W     C     C     %     %     %     %   MHz   MHz
    0     13     45      -     0      0      0      0    405    300
    0     14     45      -     0      0      0      0    405    300
    0     14     45      -     0      0      0      0    405    300
    0     13     45      -     0      0      0      0    405    300
    0     48     47      -     0      0      0      0   6801   1605
    0    139     52      -     7      2      0      0   6801   1920
    0    129     53      -    71     17      0      2   6801   1920
    0    133     53      -    93     22      0      2   6801   1920
    0    139     54      -    93     22      0      2   6801   1920
    0    135     54      -    94     22      0      2   6801   1920
    0    145     55      -    94     22      0      2   6801   1920
    0    143     55      -    94     23      0      2   6801   1920
    0    140     54      -    94     22      0      2   6801   1920
    0    145     55      -    93     22      0      2   6801   1920
    0    140     55      -    93     22      0      2   6801   1920
    0    144     56      -    93     22      0      2   6801   1920
    0    143     56      -    94     22      0      2   6801   1920
    0    142     56      -    93     22      0      2   6801   1920
    0     59     53      -    86     20      0      2   6801   1920
    0     56     52      -     1      0      0      0   6801   1920
    0     42     51      -     0      0      0      0   6801   1605
    0     43     51      -     0      0      0      0   6801   1605
    0     44     50      -     0      0      0      0   6801   1605

user@host:~$ nvidia-smi pmon -d 1

# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     0     0     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     0     0     -     -   Xorg
    0       4126     G     0     0     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     0     0     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     0     0     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     0     0     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     4     0     -     -   tritonserver
    0     272830     C     -     -     -     0   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     4     0     -     -   tritonserver
    0     272830     C    39     9     -     0   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    86    20     -     1   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    93    22     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    92    22     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    92    22     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C    46    10     -     -   tritonserver
    0     272830     C    46    11     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C    46    10     -     -   tritonserver
    0     272830     C    46    11     -     2   python3
# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C    46    11     -     -   tritonserver
    0     272830     C    46    11     -     1   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C    93    21     -     -   tritonserver
    0     272830     C     -     -     -     1   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C    46    10     -     -   tritonserver
    0     272830     C    46    11     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    93    21     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    94    22     -     1   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0     272830     C    93    21     -     2   python3
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver
    0       3844     G     -     -     -     -   Xorg
    0       4126     G     -     -     -     -   gnome-shell
    0      51984     C     -     -     -     -   tritonserver

Host RAM

root@host # free -h
              total        used        free      shared  buff/cache   available
Mem:           15Gi       3.4Gi       2.3Gi       278Mi       9.9Gi        11Gi
Swap:          19Gi          0B        19Gi
root@host# python3 nvmemstat.py  -p app.py
PID: 270827   14:24:24  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 626.2617 MiB    VmRSS: 63.9648 MiB      RssFile: 27.5273 MiB    RssAnon: 36.4375 MiB    lsof: 104
PID: 270827   14:24:25  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 18701.3555 MiB  VmRSS: 620.7617 MiB     RssFile: 251.4609 MiB   RssAnon: 334.9219 MiB   lsof: 332
PID: 270827   14:24:26  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 18964.2070 MiB  VmRSS: 905.1953 MiB     RssFile: 251.8438 MiB   RssAnon: 617.1445 MiB   lsof: 333
PID: 270827   14:24:27  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19227.0586 MiB  VmRSS: 1453.5742 MiB    RssFile: 252.0312 MiB   RssAnon: 1165.1641 MiB  lsof: 333
PID: 270827   14:24:28  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19296.7852 MiB  VmRSS: 1573.6328 MiB    RssFile: 252.0312 MiB   RssAnon: 1285.2227 MiB  lsof: 333
PID: 270827   14:24:29  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19297.0352 MiB  VmRSS: 1593.8594 MiB    RssFile: 252.0312 MiB   RssAnon: 1305.4492 MiB  lsof: 333
PID: 270827   14:24:30  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19297.0352 MiB  VmRSS: 1599.6523 MiB    RssFile: 252.0312 MiB   RssAnon: 1311.2422 MiB  lsof: 333
PID: 270827   14:24:31  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19297.0352 MiB  VmRSS: 1606.4258 MiB    RssFile: 252.0312 MiB   RssAnon: 1318.0156 MiB  lsof: 333
PID: 270827   14:24:33  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19297.0352 MiB  VmRSS: 1607.7109 MiB    RssFile: 252.0312 MiB   RssAnon: 1319.3008 MiB  lsof: 333
PID: 270827   14:24:34  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19297.0352 MiB  VmRSS: 1608.2266 MiB    RssFile: 252.0312 MiB   RssAnon: 1319.8164 MiB  lsof: 333
PID: 270827   14:24:35  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19361.0352 MiB  VmRSS: 1608.8359 MiB    RssFile: 252.1406 MiB   RssAnon: 1320.3164 MiB  lsof: 333
PID: 270827   14:24:36  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19361.0352 MiB  VmRSS: 1609.8633 MiB    RssFile: 252.1406 MiB   RssAnon: 1321.3438 MiB  lsof: 333
PID: 270827   14:24:37  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19361.0352 MiB  VmRSS: 1610.1211 MiB    RssFile: 252.1406 MiB   RssAnon: 1321.6016 MiB  lsof: 333
PID: 270827   14:24:38  Total used hardware memory: 0.0000 KiB  hardware memory: 0.0000 KiB             VmSize: 19361.0352 MiB  VmRSS: 1610.6367 MiB    RssFile: 252.1406 MiB   RssAnon: 1322.1172 MiB  lsof: 333
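One way to read the nvmemstat samples above is to check whether VmRSS keeps climbing (a leak) or plateaus after warm-up. A minimal sketch, with the VmRSS values copied from the log:

```python
# VmRSS samples (MiB) copied from the nvmemstat.py output above.
vmrss = [63.96, 620.76, 905.20, 1453.57, 1573.63, 1593.86,
         1599.65, 1606.43, 1607.71, 1608.23, 1608.84, 1609.86,
         1610.12, 1610.64]

# Per-sample growth; a leak would show sustained positive deltas,
# while a plateau after warm-up suggests stable usage.
deltas = [b - a for a, b in zip(vmrss, vmrss[1:])]
tail_growth = sum(deltas[-5:])  # growth over the last five intervals

print(f"total growth: {vmrss[-1] - vmrss[0]:.2f} MiB")
print(f"tail growth (last 5 intervals): {tail_growth:.2f} MiB")
```

Here the tail growth is under 3 MiB, i.e. host RSS is essentially flat after startup, consistent with the crash not being a simple host-memory exhaustion.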
  1. why is CUDA version 12.0? please check that all library versions meet DeepStream 6.1’s requirements. please refer to this link: Quickstart Guide — DeepStream documentation
    here are the commands:
    CUDA version: nvcc -V
    TensorRT: dpkg -l | grep TensorRT
    GStreamer: dpkg -l | grep gstreamer
    cuDNN: dpkg -l | grep cudnn

  2. if all library versions meet the requirements, you can run deepstream-infer-tensor-meta-test to test nvinferserver + nvosd’s GPU mode.

CUDA 12 is from the nvidia-smi output (the supported CUDA version); I just copy-pasted it from nvidia-smi.
I’m using the Docker image and didn’t change the NVIDIA image.

I cleaned up the code and found that the root cause of the issue was a misconfigured caps filter. We had to change the caps filter because of encoding on the CPU.
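For anyone hitting the same symptoms: nvdsosd in GPU mode expects RGBA buffers in NVMM (device) memory, so an nvvideoconvert plus capsfilter is typically placed in front of it. A hedged sketch of that branch (element names are illustrative; the `gi` import and plugin lookups are guarded):

```python
# Caps that nvdsosd's GPU mode expects: RGBA in NVMM (device) memory.
NVOSD_GPU_CAPS = "video/x-raw(memory:NVMM), format=RGBA"

def make_osd_branch():
    """Build nvvideoconvert -> capsfilter(RGBA/NVMM) -> nvdsosd.

    Returns None where GStreamer or the DeepStream plugins are
    unavailable, so this sketch can run outside the container.
    """
    try:
        import gi
        gi.require_version("Gst", "1.0")
        from gi.repository import Gst
    except (ImportError, ValueError):
        return None
    Gst.init(None)
    conv = Gst.ElementFactory.make("nvvideoconvert", "osd-conv")
    caps = Gst.ElementFactory.make("capsfilter", "osd-caps")
    osd = Gst.ElementFactory.make("nvdsosd", "osd")
    if None in (conv, caps, osd):  # DeepStream plugins missing
        return None
    caps.set_property("caps", Gst.Caps.from_string(NVOSD_GPU_CAPS))
    osd.set_property("process-mode", 1)  # GPU mode
    return conv, caps, osd
```

A mismatched caps filter here (e.g. a format forced for a CPU encoder) can hand nvdsosd buffers it cannot write to on the GPU, which matches the MMU write faults in the dmesg log.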

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.