Hi,
The neighboring topic "Linux after 520.56.06 drivers randomly segfault nvenc inside nvcuvid thread" has been ignored for a long time, so I am trying here. After the 520.56.06 NVIDIA drivers, NVENC contains a use-after-free bug. I checked drivers 525.125.06, 535.54.03, 535.86.05, 535.113.01 and 545.23.06, with CUDA 12.2, Ubuntu 22.04 and GTX 1070 hardware. Attached is a patch for Video_Codec_SDK_12.1.14 that adds a new sample, Samples/AppEncode/AppEncCudaBug.
This code creates and releases 300 encode sessions. valgrind detects invalid reads, and the code sometimes segfaults.
To reproduce:
src $ patch -p1 < Video_Codec_SDK_12.1.14_bug.patch.txt
src $ cd src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug
src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug $ mkdir build
src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug $ cd build
src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug/build $ cmake ..
src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug/build $ make
src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug/build $ ./AppEncCudaBug
src/Video_Codec_SDK_12.1.14/Samples/AppEncode/AppEncCudaBug/build $ valgrind --trace-children=yes --leak-check=full --log-file=valgrind.txt ./AppEncCudaBug
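Because the sample segfaults only some of the time, a single run may pass cleanly. A minimal sketch for repeating the run and counting abnormal exits follows; count_failures is a hypothetical helper name, not part of the SDK or the patch, and the run count of 50 is an arbitrary choice.

```shell
#!/bin/sh
# count_failures is a hypothetical helper (an assumption, not from the SDK or patch).
# Usage: count_failures RUNS COMMAND [ARGS...]
# Runs COMMAND RUNS times and prints how many runs ended with a non-zero status.
# A process killed by SIGSEGV is reported by the shell as status 139, so it is counted.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the program's own output; only the exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example, run from the build directory (50 is an arbitrary number of attempts):
# count_failures 50 ./AppEncCudaBug
```

A result of 0 does not show that the bug is absent, only that it did not trigger in those runs; the valgrind run above is the more reliable check.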
The invalid read part of valgrind.txt:
==450484== Thread 10:
==450484== Invalid read of size 4
==450484== at 0x6A16F01: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x6A17039: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x6A863B5: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x6A86B1C: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x787431B: start_thread (pthread_create.c:444)
==450484== by 0x78F76AF: clone (clone.S:100)
==450484== Address 0x26606c8c is 515,932 bytes inside an unallocated block of size 1,826,752 in arena "client"
==450484==
==450484== Invalid read of size 4
==450484== at 0x6A16F07: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x6A17039: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x6A863B5: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x6A86B1C: ??? (in /usr/lib64/libnvcuvid.so.545.23.06)
==450484== by 0x787431B: start_thread (pthread_create.c:444)
==450484== by 0x78F76AF: clone (clone.S:100)
==450484== Address 0x26606c78 is 515,912 bytes inside an unallocated block of size 1,826,752 in arena "client"
The segfault (AddressSanitizer output):
AddressSanitizer:DEADLYSIGNAL
=================================================================
==444460==ERROR: AddressSanitizer: SEGV on unknown address 0x7f3bf400a40c (pc 0x7f3c02c16f01 bp 0x7f3c0378b5a8 sp 0x7f3bd3e386f0 T-1)
==444460==The signal is caused by a READ memory access.
#0 0x7f3c02c16f01 (/usr/lib64/libnvcuvid.so.1+0x16f01) (BuildId: e45304d759eabb77c567f3332a917d9eb61ab913)
#1 0x7f3c02c17039 (/usr/lib64/libnvcuvid.so.1+0x17039) (BuildId: e45304d759eabb77c567f3332a917d9eb61ab913)
#2 0x7f3c02c863b5 (/usr/lib64/libnvcuvid.so.1+0x863b5) (BuildId: e45304d759eabb77c567f3332a917d9eb61ab913)
#3 0x7f3c02c86b1c (/usr/lib64/libnvcuvid.so.1+0x86b1c) (BuildId: e45304d759eabb77c567f3332a917d9eb61ab913)
#4 0x7f3c026b231b in start_thread /var/tmp/portage/sys-libs/glibc-2.37-r7/work/glibc-2.37/nptl/pthread_create.c:444
#5 0x7f3c0273579b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib64/libnvcuvid.so.1+0x16f01) (BuildId: e45304d759eabb77c567f3332a917d9eb61ab913)
==444460==ABORTING
Video_Codec_SDK_12.1.14_bug.patch.txt (10.8 KB)