Segfault in nvv4l2h264enc (minimal reproducible example included)

• Hardware Platform (GPU)
• DeepStream Version 7.1
• NVIDIA GPU Driver Version (valid for GPU only) 550, 560, 570
• Issue Type (bugs)
• How to reproduce the issue ? GitHub - lumeohq/deepstream-encoder-segfault

Hello!

We see frequent segfaults when running multiple pipelines in multiple threads in the same process. Under some conditions, the program segfaults with roughly 80% probability within the first 5 seconds, so it is quite easy to reproduce. We tested this on 2 different machines with different drivers. The 535 driver is not affected, but 550, 560, and 570 are.

More details are in the README on GitHub. The MRE contains a script that runs a simple C application in the nvcr.io/nvidia/deepstream:7.1-triton-multiarch Docker image.

GDB backtrace:

Thread 32 "videotestsrc2:s" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x732e31000640 (LWP 222)]
0x0000732e71417941 in ?? () from /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1

I tested the case you provided with the 570 driver on an RTX 3070 Ti, and it exits normally without any crash.

wget https://us.download.nvidia.com/tesla/570.133.20/nvidia-driver-local-repo-ubuntu2204-570.133.20_1.0-1_amd64.deb
sudo dpkg -i nvidia-driver-local-repo-ubuntu2204-570.133.20_1.0-1_amd64.deb
sudo cp /var/nvidia-driver-local-repo-ubuntu2204-570.133.20/nvidia-driver-local-6AA56764-keyring.gpg /usr/share/keyrings/

sudo apt update
sudo apt install cuda-drivers

However, the 3070 Ti has a limit on the number of encoder instances. You can refer to this table.

Are you testing on Ubuntu 22.04? We have only tested DS-7.1 on Ubuntu 22.04.

We run Ubuntu 22.04 in production, and my dev machine runs Linux Mint 22.1 (similar to Ubuntu 24.04), where it crashes too. The driver there is installed from the Linux Mint driver manager: 570.144 (open).

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.144                Driver Version: 570.144        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3070 Ti     Off |   00000000:0A:00.0  On |                  N/A |
|  0%   48C    P8             28W /  310W |    1340MiB /   8192MiB |      8%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

I also tried installing the driver from the NVIDIA-Linux-x86_64-570.144.run file downloaded from NVIDIA, and the test still fails.

Have you tried running run.sh multiple times? It fails only some of the time. For instance, I ran it 5 times:

Running in normal mode...
Exit code: 139
...
Running in normal mode...
Exit code: 0
...
Running in normal mode...
Exit code: 0
...
Running in normal mode...
Exit code: 0
...
Running in normal mode...
Exit code: 139
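
For anyone reading the output above: exit code 139 follows the shell convention of 128 plus the signal number, and 139 - 128 = 11 is SIGSEGV on Linux, so those runs are the segfaults. A minimal Python sketch of that decoding, assuming run.sh reports the usual shell-style status (the helper name describe_exit_code is hypothetical and not part of the MRE):

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode a shell-style exit status.

    Shells report a process killed by signal N as exit status 128 + N.
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform
            name = f"signal {signum}"
        return f"killed by {name}"
    return "success" if code == 0 else f"exited with error {code}"


print(describe_exit_code(139))
print(describe_exit_code(0))
```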

You can also run

sudo apt install python3-tabulate
./table.py

and leave it running for half an hour. If the resulting table is all zeros, then it works on your setup.
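
If you prefer a quick count over the full table, repeated runs can be tallied with a few lines of Python. This is only a sketch and is independent of table.py, whose internals are not shown in this thread. The function name tally_exit_codes is hypothetical, and it assumes the command's own exit status reflects the crash, which I have not verified for run.sh:

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(cmd, runs):
    """Run cmd the given number of times and count each exit status."""
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        code = result.returncode
        # subprocess reports death by signal N as -N; convert it to the
        # shell convention (128 + N) so a segfault shows up as 139.
        if code < 0:
            code = 128 - code
        counts[code] += 1
    return counts


# Demo with a trivial child process that always exits with status 0.
demo = tally_exit_codes([sys.executable, "-c", "raise SystemExit(0)"], 3)
print(dict(demo))
```

To use it against the MRE, pass something like ["./run.sh"] as cmd with a larger run count. Any non-zero key in the result, 139 in particular, means the crash reproduced on your setup.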