Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
T4 GPU
• DeepStream Version
6.1.1
• TensorRT Version
TensorRT 8.4.1-1 (same as docker image deepstream:6.1.1-base)
• NVIDIA GPU Driver Version (valid for GPU only)
525.85.12
• Issue Type( questions, new requirements, bugs)
questions
• Question
We’ve been using DeepStream 6.1.1 on servers with 3 to 4 T4 GPU cards each. The host’s NVIDIA driver has been updated to a fairly recent version, 525.85.12.
Recently we ran into a core dump that happens randomly. The error message is below:
nvbuf_utils: dmabuf_fd -1 mapped entry NOT found
Failed to get surface from fd = -1
When this issue happens, the number of T4 cards shown by nvidia-smi on the host often drops from 3 to 2 or 1, while lspci | grep NV shows that all the cards are still there.
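To make that mismatch easier to catch when it happens, here is a minimal sketch of a check that compares the two counts. It assumes nvidia-smi -L prints one "GPU N:" line per device the driver can see, and that the T4 cards show up in lspci as NVIDIA "3D controller" (or "VGA compatible controller") entries. The function names are hypothetical and not part of DeepStream or any NVIDIA tool.

```python
import re
import subprocess


def count_smi_gpus(smi_output: str) -> int:
    """Count devices listed by `nvidia-smi -L` (one 'GPU N: ...' line each)."""
    return sum(1 for line in smi_output.splitlines()
               if re.match(r"^GPU \d+:", line))


def count_pci_gpus(lspci_output: str) -> int:
    """Count NVIDIA display-class devices in `lspci` output.

    Assumption: GPUs are reported as '3D controller' or
    'VGA compatible controller'; other NVIDIA functions are ignored.
    """
    classes = ("3D controller", "VGA compatible controller")
    return sum(1 for line in lspci_output.splitlines()
               if "NVIDIA" in line and any(c in line for c in classes))


def run(cmd):
    """Run a command and return its stdout (empty string if it printed nothing)."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    seen_by_driver = count_smi_gpus(run(["nvidia-smi", "-L"]))
    seen_on_bus = count_pci_gpus(run(["lspci"]))
    print(f"driver sees {seen_by_driver} GPU(s), PCI bus lists {seen_on_bus}")
    if seen_by_driver < seen_on_bus:
        print("WARNING: at least one card is on the bus but not visible to the driver")
```

Running something like this periodically next to the pipeline (for example from cron) would at least give a timestamp for when a card disappears, which can then be lined up with the core dump.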
I am almost sure this error is related to NvBufSurfaceParams::bufferDesc, defined in nvbufsurface.h, because the NvBufSurfaceParams structure is used in our program.
I searched online and found that this error should only happen on the Jetson platform. So my question is: is there any chance that a T4 is treated as a Jetson device when an error like “losing a card” happens?
By the way, I don’t know exactly what it means to “lose a card”, and this problem is not easy to reproduce.
Thanks for your help!