Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
GPU
• DeepStream Version
deepstream-6.1
docker image : nvcr.io/nvidia/deepstream:6.1.1-triton
• NVIDIA GPU Driver Version (valid for GPU only)
NVIDIA-SMI 525.85.05 Driver Version: 525.85.05 CUDA Version: 12.0
GPU 0: NVIDIA GeForce RTX 2070 SUPER
• Issue Type( questions, new requirements, bugs)
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_ref_app_deepstream.html#osd-group
Setting the NvOSD process mode to GPU causes a segmentation fault.
With NvOSD in CPU mode, everything works fine.
Configuration that triggers the issue (process-mode 1, GPU):
nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
nvosd.set_property('process-mode',1)
Working configuration (process-mode 0, CPU):
nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
nvosd.set_property('process-mode',0)
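To switch between the two configurations above without editing the pipeline code, something like the following sketch might help. It assumes the mode values used in this post (0 = CPU, 1 = GPU); the environment variable `OSD_PROCESS_MODE` and the helper names are hypothetical, not part of DeepStream.

```python
import os

# Mode values as used in this post: 0 = CPU (works), 1 = GPU (crashes here).
OSD_MODE_CPU = 0
OSD_MODE_GPU = 1


def select_osd_mode(env=os.environ):
    """Pick the nvdsosd process-mode from OSD_PROCESS_MODE (hypothetical variable).

    Anything other than "gpu" falls back to CPU, the mode that works in this setup.
    """
    raw = env.get("OSD_PROCESS_MODE", "cpu").strip().lower()
    return OSD_MODE_GPU if raw == "gpu" else OSD_MODE_CPU


def configure_osd(nvosd, env=os.environ):
    """Apply the selected mode to an element exposing set_property(), e.g. nvdsosd."""
    mode = select_osd_mode(env)
    nvosd.set_property("process-mode", mode)
    return mode
```

With this, `configure_osd(nvosd)` would replace the hard-coded `set_property` call, and running with `OSD_PROCESS_MODE=gpu` reproduces the failing case.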
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I’m processing multiple videos. Some are fully processed without issue, but for others the nvdsosd plugin raises errors after some elapsed time, always at the same time.
I already checked the videos for corruption and re-encoded them in other formats, but the same problem occurs.
I’m using Triton Server with 6 models on the same host, with decoding and encoding on the GPU. Only running NvOSD on the GPU causes the issue.
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)
The errors raised are not the same across executions.
First Execution
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
ERROR: infer_preprocess.cpp:563 cudaMemset2DAsync failed to set 0 to scaled padding area, cuda err_no:700, err_str:cudaErrorIllegalAddress
Unable to release host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Segmentation fault (core dumped)
Second Execution
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
nvbufsurface: NvBufSurfaceSysToHWCopy: failed in mem copy
nvbufsurface: NvBufSurfaceCopy: failed to copy
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
ERROR: infer_preprocess.cpp:391 Failed to convert input-format to network format
Unable to release host memory.
ERROR: infer_preprocess.cpp:280 Failed to preprocess buffer during cudaTransform.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Segmentation fault (core dumped)
Third Execution
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
GPUassert: an illegal memory access was encountered src/modules/cuDCFv2/cuDCF.cu 778
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to release host memory.
Unable to allocate host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to release host memory.
Unable to copy between two host memories.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to release host memory.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
ERROR: infer_preprocess.cpp:563 cudaMemset2DAsync failed to set 0 to scaled padding area, cuda err_no:700, err_str:cudaErrorIllegalAddress
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Unable to allocate host memory.
Unable to copy between two host memories.
Segmentation fault (core dumped)
Linux DMESG
[ 6132.623144] NVRM: Xid (PCI:0000:0f:00): 31, pid=145917, name=python3, Ch 00000032, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7f4e_cc7e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 6132.627429] SurfConv-1[145989]: segfault at 400000000 ip 00007f5011d265c2 sp 00007f4d43ffe390 error 4 in libcuda.so.525.85.05[7f5011b33000+491000]
[ 6132.627439] Code: 00 8b 44 24 18 48 81 c4 b8 00 00 00 5b 41 5c c3 0f 1f 84 00 00 00 00 00 48 c7 44 24 38 00 00 00 00 48 85 db 0f 84 a6 00 00 00 <48> 8b 3b e8 76 d3 1d 00 85 c0 75 cf 8b 43 48 85 c0 75 16 8b 53 60
[ 6132.627614] NetPreproc2[145979]: segfault at 40 ip 00007f4feec9cc1a sp 00007f4dcbffeac0 error 4 in libnvds_infer_server.so[7f4feec3b000+475000]
[ 6132.627625] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[ 6185.237163] NVRM: Xid (PCI:0000:0f:00): 31, pid=146058, name=python3, Ch 00000035, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7ff4_907e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 6293.066185] NVRM: Xid (PCI:0000:0f:00): 31, pid=146322, name=python3, Ch 00000032, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_6 faulted @ 0x7fbf_447e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 6293.070448] NetPreproc1[146390]: segfault at 40 ip 00007fc049737c1a sp 00007fbd97ffeac0 error 4 in libnvds_infer_server.so[7fc0496d6000+475000]
[ 6293.070457] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[ 9275.190965] NVRM: Xid (PCI:0000:0f:00): 31, pid=150262, name=python3, Ch 00000036, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_6 faulted @ 0x7f10_067e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[ 9275.195639] SurfConv-1[150332]: segfault at 40 ip 00007f115627ec1a sp 00007f0e83ffeac0 error 4 in libnvds_infer_server.so[7f115621d000+475000]
[ 9275.195651] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[10094.157628] NVRM: Xid (PCI:0000:0f:00): 31, pid=153127, name=python3, Ch 00000031, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7f63_1e7e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[10094.161843] NetPreproc1[153201]: segfault at 40 ip 00007f647321dc1a sp 00007f618fffeac0 error 4 in libnvds_infer_server.so[7f64731bc000+475000]
[10094.161859] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[10166.773101] NVRM: Xid (PCI:0000:0f:00): 31, pid=153319, name=python3, Ch 00000035, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_2 faulted @ 0x7fa6_e27e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[10166.778003] SurfConv-1[153389]: segfault at 40 ip 00007fa828c63c1a sp 00007fa55fffeac0 error 4 in libnvds_infer_server.so[7fa828c02000+475000]
[10166.778019] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
[10284.695579] NVRM: Xid (PCI:0000:0f:00): 31, pid=153649, name=python3, Ch 00000034, intr 00000000. MMU Fault: ENGINE GRAPHICS GPCCLIENT_T1_0 faulted @ 0x7fe0_767e9000. Fault is of type FAULT_PTE ACCESS_TYPE_VIRT_WRITE
[10284.700092] SurfConv-1[153719]: segfault at 40 ip 00007fe1b5b28c1a sp 00007fdef3ffeac0 error 4 in libnvds_infer_server.so[7fe1b5ac7000+475000]
[10284.700111] Code: 8b 6a 08 0f 11 02 44 8b 26 48 8b 43 38 f3 0f 7e 43 30 48 85 c0 74 13 48 83 3d 89 bf 86 00 00 0f 84 e3 00 00 00 f0 83 40 08 01 <4d> 8b 75 40 66 48 0f 6e c8 66 0f 6c c1 41 0f 11 45 38 4d 85 f6 74
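To compare crashes across runs, the dmesg output above can be summarized with a short script. This is only a sketch: the regular expressions are written against the line shapes shown in this post and may need adjusting for other kernels or driver versions, and the function name is made up for illustration. Per NVIDIA's Xid documentation, Xid 31 is a GPU memory page fault, which matches the MMU fault text in the log.

```python
import re
from collections import Counter

# Matches lines such as:
# "... NVRM: Xid (PCI:0000:0f:00): 31, pid=145917, name=python3, ..."
XID_RE = re.compile(
    r"NVRM: Xid \(PCI:(?P<pci>[0-9a-fA-F:.]+)\): (?P<xid>\d+), "
    r"pid=(?P<pid>\d+), name=(?P<name>[^,]+),"
)

# Matches lines such as:
# "[ 6132.627429] SurfConv-1[145989]: segfault at 40 ip ... sp ... error 4 in libfoo.so[...]"
SEGV_RE = re.compile(
    r"\]\s+(?P<thread>[^\[\]\s]+)\[(?P<tid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip [0-9a-f]+ sp [0-9a-f]+ error \d+ in (?P<lib>[^\[\s]+)\["
)


def summarize_dmesg(lines):
    """Count Xid codes and (thread, library) pairs for segfaults in dmesg lines."""
    xids = Counter()
    segfaults = Counter()
    for line in lines:
        m = XID_RE.search(line)
        if m:
            xids[int(m.group("xid"))] += 1
            continue
        m = SEGV_RE.search(line)
        if m:
            segfaults[(m.group("thread"), m.group("lib"))] += 1
    return xids, segfaults


if __name__ == "__main__":
    import sys

    xid_counts, segv_counts = summarize_dmesg(sys.stdin)
    for xid, n in sorted(xid_counts.items()):
        print(f"Xid {xid}: {n} event(s)")
    for (thread, lib), n in segv_counts.most_common():
        print(f"segfault in {lib} (thread {thread}): {n}")
```

Saved under any filename, it can be fed with `dmesg | python3 <script>.py`; grouping by thread and library makes it easier to see whether the crash always lands in the same component.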