Hi,
On Linux (Ubuntu 22.04, Gentoo), with NVIDIA drivers after 520.56.06, my custom NVENC solution randomly segfaults, probably on short-lived sessions such as test cases in CI. I checked 525.125.06, 535.54.03, and 535.86.05 (CUDA 12.2); all are broken in my case. In gdb the backtrace always points to the same place. I can also partially reproduce the problem without a segfault, but with a valgrind warning that looks the same. Attached is a patch for Video_Codec_SDK_12.1.14 with an example that creates NVENC contexts in a thread loop and frees them after a random 100…300 ms. To reproduce:
valgrind --trace-children=yes --leak-check=full --log-file=valgrind.txt AppEncode/AppEncCudaBug/AppEncCudaBug
AppEncCudaBug.patch (10.1 KB)
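The attached patch is not reproduced here. As a rough sketch only, the create/sleep/destroy pattern it exercises looks something like the code below. `FakeEncoderSession` and `run_session_churn` are hypothetical names used for illustration and are not part of the SDK or the patch. In the real example the session object is an `NvEncoderCuda` from the SDK samples, created and destroyed inside a worker lambda (as the traces show), so this stub needs neither the SDK nor a GPU and will not trigger the bug by itself.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stand-in for an encoder session. In the real reproducer this
// would be NvEncoderCuda; it is stubbed so the sketch builds without the SDK.
struct FakeEncoderSession {
    explicit FakeEncoderSession(std::atomic<int>& live) : live_(live) { ++live_; }
    ~FakeEncoderSession() { --live_; }
    FakeEncoderSession(const FakeEncoderSession&) = delete;
    FakeEncoderSession& operator=(const FakeEncoderSession&) = delete;
private:
    std::atomic<int>& live_;  // counts sessions that are currently alive
};

// Each worker repeatedly creates a session, keeps it alive for a random time
// in [min_ms, max_ms], then destroys it. Returns the number of sessions created.
int run_session_churn(int workers, int iterations, int min_ms, int max_ms) {
    assert(min_ms <= max_ms);
    std::atomic<int> live{0};
    std::atomic<int> created{0};
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Per-thread generator so workers do not share RNG state.
            std::mt19937 rng(std::random_device{}() + w);
            std::uniform_int_distribution<int> lifetime(min_ms, max_ms);
            for (int i = 0; i < iterations; ++i) {
                FakeEncoderSession session(live);  // "create encoder"
                ++created;
                std::this_thread::sleep_for(
                    std::chrono::milliseconds(lifetime(rng)));
            }  // "destroy encoder" when session goes out of scope
        });
    }
    for (auto& t : pool) t.join();
    assert(live.load() == 0);  // every session was destroyed
    return created.load();
}
```

Calling `run_session_churn(4, 100, 100, 300)` from `main` would approximate the timing described above; swapping the stub for a real encoder object is what the patch does.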
valgrind:
==964328== Invalid read of size 4
==964328== at 0x6A16D81: ??? (in /usr/lib64/libnvcuvid.so.535.86.05)
==964328== by 0x6A16EB9: ??? (in /usr/lib64/libnvcuvid.so.535.86.05)
==964328== by 0x6A85835: ??? (in /usr/lib64/libnvcuvid.so.535.86.05)
==964328== by 0x6A85F9C: ??? (in /usr/lib64/libnvcuvid.so.535.86.05)
==964328== by 0x78282DB: start_thread (pthread_create.c:444)
==964328== by 0x78AB69F: clone (clone.S:100)
==964328== Address 0xc28024c is 134,156 bytes inside a block of size 230,056 free'd
==964328== at 0x484310E: free (vg_replace_malloc.c:974)
==964328== by 0x660BD4D: ??? (in /usr/lib64/libnvidia-encode.so.535.86.05)
==964328== by 0x6605759: ??? (in /usr/lib64/libnvidia-encode.so.535.86.05)
==964328== by 0x661E7CD: ??? (in /usr/lib64/libnvidia-encode.so.535.86.05)
==964328== by 0x1321B1: NvEncoder::DestroyHWEncoder() (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x12F651: NvEncoder::~NvEncoder() (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x13F3EF: NvEncoderCuda::~NvEncoderCuda() (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x13F40B: NvEncoderCuda::~NvEncoderCuda() (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115755: main::{lambda()#1}::operator()() const (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115DF5: void std::__invoke_impl<void, main::{lambda()#1}&>(std::__invoke_other, main::{lambda()#1}&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115CE9: std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<main::{lambda()#1}&> >::value, void>::type std::__invoke_r<void, main::{lambda()#1}&>(main::{lambda()#1}&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115BCD: std::_Function_handler<void (), main::{lambda()#1}>::_M_invoke(std::_Any_data const&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== Block was alloc'd at
==964328== at 0x4840797: malloc (vg_replace_malloc.c:431)
==964328== by 0x6605958: ??? (in /usr/lib64/libnvidia-encode.so.535.86.05)
==964328== by 0x661C8CC: ??? (in /usr/lib64/libnvidia-encode.so.535.86.05)
==964328== by 0x13168C: NvEncoder::CreateEncoder(_NV_ENC_INITIALIZE_PARAMS const*) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x1156D6: main::{lambda()#1}::operator()() const (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115DF5: void std::__invoke_impl<void, main::{lambda()#1}&>(std::__invoke_other, main::{lambda()#1}&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115CE9: std::enable_if<std::__and_<std::is_void<void>, std::__is_invocable<main::{lambda()#1}&> >::value, void>::type std::__invoke_r<void, main::{lambda()#1}&>(main::{lambda()#1}&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x115BCD: std::_Function_handler<void (), main::{lambda()#1}>::_M_invoke(std::_Any_data const&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x1242EB: std::function<void ()>::operator()() const (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x1152D2: ThreadPool::ThreadLoop() (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x12E6C7: void std::__invoke_impl<void, void (ThreadPool::*)(), ThreadPool*>(std::__invoke_memfun_deref, void (ThreadPool::*&&)(), ThreadPool*&&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
==964328== by 0x12E626: std::__invoke_result<void (ThreadPool::*)(), ThreadPool*>::type std::__invoke<void (ThreadPool::*)(), ThreadPool*>(void (ThreadPool::*&&)(), ThreadPool*&&) (in /home/hizel/src/Video_Codec_SDK_12.1.14.bug/Samples/build/AppEncode/AppEncCudaBug/AppEncCudaBug)
gdb:
#0 0x00007f78dfa16d81 in ?? () from /usr/lib64/libnvcuvid.so.1
#1 0x00007f78dfa16eba in ?? () from /usr/lib64/libnvcuvid.so.1
#2 0x00007f78dfa85806 in ?? () from /usr/lib64/libnvcuvid.so.1
#3 0x00007f78dfa85f6d in ?? () from /usr/lib64/libnvcuvid.so.1
#4 0x00007f79d568e2dc in start_thread (arg=<optimized out>) at pthread_create.c:444
#5 0x00007f79d571178c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
0x7f78dfa00000 0x7f78e02f7000 0x8f7000 0x0 /usr/lib64/libnvcuvid.so.535.54.03
0x7f78e02f7000 0x7f78e04f6000 0x1ff000 0x8f7000 /usr/lib64/libnvcuvid.so.535.54.03
0x7f78e04f6000 0x7f78e053f000 0x49000 0x8f6000 /usr/lib64/libnvcuvid.so.535.54.03
0x7f78e053f000 0x7f78e0540000 0x1000 0x93f000 /usr/lib64/libnvcuvid.so.535.54.03