Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU)
• DeepStream Version
• TensorRT Version
TensorRT 8.4.1-1 (same as docker image deepstream:6.1.1-base)
• NVIDIA GPU Driver Version (valid for GPU only)
• Issue Type( questions, new requirements, bugs)
We’ve been using DeepStream 6.1.1 on servers with 3 to 4 T4 GPU cards each. The host's NVIDIA driver has been updated to a fairly recent version, 525.85.12.
Recently we ran into a core dump that happens randomly. The error message is below:
nvbuf_utils: dmabuf_fd -1 mapped entry NOT found
Failed to get surface from fd = -1
When this happens, the number of T4 cards reported by nvidia-smi on the host often drops from 3 to 2 or 1, while lspci | grep NV shows that all the cards are still there.
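Since the problem is hard to reproduce, here is a minimal sketch of how the mismatch between the two tools could be logged on the host. It assumes `nvidia-smi -L` prints one line per visible GPU starting with "GPU ", and that each card's `lspci` line contains both "NVIDIA" and "T4". The function names are made up for illustration and are not part of any NVIDIA tool.

```python
import subprocess


def count_smi_gpus(smi_output: str) -> int:
    """Count GPUs the driver can see: lines like 'GPU 0: Tesla T4 (UUID: ...)'."""
    return sum(1 for line in smi_output.splitlines() if line.startswith("GPU "))


def count_pci_gpus(lspci_output: str, model: str = "T4") -> int:
    """Count cards present on the PCI bus (assumes 'NVIDIA' and the model name appear on the line)."""
    return sum(
        1
        for line in lspci_output.splitlines()
        if "NVIDIA" in line and model in line
    )


def run(cmd: list) -> str:
    """Run a command and return its stdout, or an empty string if the tool is missing."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout
    except OSError:
        return ""


def missing_gpus() -> int:
    """Cards on the bus minus cards visible to the driver; above 0 means a card was 'lost'."""
    return count_pci_gpus(run(["lspci"])) - count_smi_gpus(run(["nvidia-smi", "-L"]))


if __name__ == "__main__":
    print("GPUs on the PCI bus but not visible to the driver:", missing_gpus())
```

Running this periodically (for example from cron) next to the DeepStream process would give a timestamp for when a card disappears, which can then be compared with the time of the core dump and with the kernel log.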
I am almost sure this error is related to NvBufSurfaceParams::bufferDesc, defined in nvbufsurface.h, because our program uses the NvBufSurfaceParams structure.
I searched and found that this error should only happen on the Jetson platform. So my question is: is there any chance a T4 is treated as a Jetson device when an error like “losing a card” happens?
By the way, I don’t know exactly what it means to lose a card, and this problem is not easy to reproduce.
Thanks for your help!