Continuously using h264_cuvid with h264_nvenc makes the encoding process hang

Using ffmpeg with CUDA to encode files continuously with the options below (through a bash script) makes the encoding process hang. The whole system becomes unresponsive and a reboot through ssh is needed. Sometimes it hangs on the first file and sometimes on a later one, but it always hangs before 5 or 6 files have been processed.

When it hangs, nvidia-smi outputs this:

gpu   pwr  temp    sm   mem   enc   dec  mclk  pclk
Idx     W     C     %     %     %     %   MHz   MHz
  0    51    53     0     0    50     0  3802    90
  0    51    54     0     0    50     0  3802    90

The latest versions of the drivers, the Video SDK, and the CUDA SDK are installed.

######################################################################################
ffmpeg version N-85093-g7942907 Copyright © 2000-2017 the FFmpeg developers
built with gcc 5.4.0 (Gentoo 5.4.0 p1.0, pie-0.6.5)
configuration: --prefix=/home/tefid/Software/ffmpeg-cuda/ --libdir=/home/tefid/Software/ffmpeg-cuda/lib --shlibdir=/home/tefid/Software/ffmpeg-cuda/lib --docdir=/home/tefid/Software/ffmpeg-cuda/doc --mandir=/home/tefid/Software/ffmpeg-cuda/man --enable-static --disable-shared --cc=x86_64-pc-linux-gnu-gcc --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar --optflags=’-O3 -march=native -pipe’ --enable-avfilter --enable-avresample --disable-stripping --enable-version3 --disable-indev=jack --enable-version3 --enable-version3 --enable-nonfree --extra-cflags=-I/opt/cuda/include --extra-ldflags=-L/opt/cuda/lib64/ --enable-bzlib --disable-runtime-cpudetect --disable-debug --disable-gcrypt --disable-gnutls --enable-gmp --enable-gpl --enable-hardcoded-tables --enable-iconv --enable-lzma --enable-network --disable-openssl --enable-postproc --enable-libsmbclient --enable-ffplay --enable-sdl2 --enable-vaapi --enable-vdpau --enable-xlib --enable-libxcb --enable-libxcb-shm --enable-libxcb-xfixes --enable-zlib --enable-libcdio --disable-libiec61883 --disable-libdc1394 --disable-libcaca --enable-openal --enable-opengl --disable-libv4l2 --enable-libpulse --enable-libopencore-amrwb --enable-libopencore-amrnb --enable-libfdk-aac --disable-libopenjpeg --disable-libbluray --disable-libcelt --disable-libgme --disable-libgsm --disable-mmal --disable-libmodplug --disable-libopus --disable-libilbc --disable-librtmp --enable-libssh --disable-libschroedinger --disable-libspeex --enable-libvorbis --enable-libvpx --disable-libzvbi --enable-libnpp --enable-cuvid --enable-cuda --disable-libbs2b --disable-chromaprint --disable-libflite --disable-frei0r --disable-libfribidi --disable-fontconfig --disable-ladspa --disable-libass --enable-libfreetype --disable-librubberband --disable-libzimg --disable-libsoxr --enable-pthreads --disable-libvo-amrwbenc --enable-libmp3lame --disable-libkvazaar --enable-nvenc --disable-libopenh264 --disable-libsnappy --enable-libtheora --disable-libtwolame 
--enable-libwavpack --disable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --disable-amd3dnow --disable-amd3dnowext --disable-fma4 --disable-xop --cpu=host --disable-doc --disable-htmlpages --enable-manpages
libavutil 55. 57.100 / 55. 57.100
libavcodec 57. 88.100 / 57. 88.100
libavformat 57. 70.100 / 57. 70.100
libavdevice 57. 5.100 / 57. 5.100
libavfilter 6. 81.100 / 6. 81.100
libavresample 3. 4. 0 / 3. 4. 0
libswscale 4. 5.100 / 4. 5.100
libswresample 2. 6.100 / 2. 6.100
libpostproc 54. 4.100 / 54. 4.100
Input #0, matroska,webm, from '/mnt/game2/Cartoon-hq/Bernard/04 - The Swing.mkv':
Metadata:
encoder : libebml v0.8.0 + libmatroska v0.9.0
creation_time : 2010-09-16T08:50:05.000000Z
Duration: 00:03:24.88, start: 0.000000, bitrate: 592 kb/s
Chapter #0:0: start 0.105000, end 204.880000
Metadata:
title : 00:00:00.105
Stream #0:0(eng): Video: h264 (Main), yuv420p(progressive), 720x400 [SAR 1:1 DAR 9:5], 25 fps, 25 tbr, 1k tbn, 50 tbc
Stream #0:1: Audio: aac (HE-AAC), 44100 Hz, stereo, fltp (default)
Stream mapping:
Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> h264 (h264_nvenc))
Stream #0:1 -> #0:1 (aac (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[libfdk_aac @ 0x38930e0] Note, the VBR setting is unsupported and only works with some parameter combinations
[h264_nvenc @ 0x3891f20] Defined rc_lookahead requires more surfaces, increasing used surfaces 32 -> 38
Output #0, mp4, to '/mnt/game2/cartoon-lq/Bernard/04 - The Swing.mp4':
Metadata:
encoder : Lavf57.70.100
Chapter #0:0: start 0.105000, end 204.880000
Metadata:
title : 00:00:00.105
Stream #0:0(eng): Video: h264 (h264_nvenc) (High) ([33][0][0][0] / 0x0021), cuda, 432x240 [SAR 1:1 DAR 9:5], q=28-30, 2000 kb/s, 21 fps, 10752 tbn, 21 tbc
Metadata:
encoder : Lavc57.88.100 h264_nvenc
Side data:
cpb: bitrate max/min/avg: 0/0/2000000 buffer size: 4000000 vbv_delay: -1
Stream #0:1: Audio: aac (libfdk_aac) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, s16 (default)
Metadata:
encoder : Lavc57.88.100 libfdk_aac
Past duration 0.639992 too large
Past duration 0.799995 too large
[h264_nvenc @ 0x3891f20] Failed locking bitstream buffer: invalid param (8)
Video encoding failed

I use the following options for ffmpeg and CUDA:

ffmpeg -hwaccel cuvid -c:v h264_cuvid -i -f mp4 -c:v h264_nvenc -profile:v high -preset slow -pixel_format yuv420p -vf scale_npp=-1:240 -rc vbr_2pass -rc-lookahead 32 -cq 30 -qmin 28 -qmax 30 -r 21 -spatial-aq 1 -aq-strength 12 -c:a libfdk_aac -vbr 1 -ar 44100 -ac 2

The CUDA device is a Pascal GTX 1070. The strange thing is that with hevc_nvenc and the same options it works just fine and never hangs.

########################Installed software########################

[I] x11-drivers/nvidia-drivers
Available versions: [M]96.43.23-r1(0/96)^msd [M]173.14.39-r1(0/173)^msd M173.14.39-r2(0/173)^msd 304.134(0/304)^md (~)304.134-r1(0/304)^md 304.135(0/304)^md 340.101(0/340)^md (~)340.101-r1(0/340)^md 340.102(0/340)^md 375.26(0/375)^md (~)375.26-r3(0/375)^md 375.39(0/375)^md 378.13(0/378)^md{tbz2} {+X acpi compat custom-cflags +driver gtk gtk3 +kms multilib pax_kernel static-libs (+)tools uvm wayland ABI_MIPS="n32 n64 o32" ABI_PPC="32 64" ABI_S390="32 64" ABI_X86="32 64 x32" KERNEL="FreeBSD linux"}
Installed versions: 378.13^md{tbz2}(08:12:42 PM 04/04/2017)(X acpi driver gtk3 kms multilib static-libs tools uvm -compat -pax_kernel -wayland ABI_MIPS="-n32 -n64 -o32" ABI_PPC="-32 -64" ABI_S390="-32 -64" ABI_X86="32 64 -x32" KERNEL="linux -FreeBSD")
Homepage: http://www.nvidia.com/ http://www.nvidia.com/Download/Find.aspx
Description: NVIDIA Accelerated Graphics Driver

[I] dev-util/nvidia-cuda-sdk
Available versions: 6.5.19^t (~)7.5.18^t (~)8.0.44-r1^t{tbz2} (~)8.0.61^t{tbz2} {+cuda debug +doc +examples mpi opencl}
Installed versions: 8.0.61^t{tbz2}(10:22:56 AM 04/03/2017)(cuda doc examples -debug -mpi -opencl)
Homepage: https://developer.nvidia.com/cuda-zone
Description: NVIDIA CUDA Software Development Kit

[I] dev-util/nvidia-cuda-toolkit
Available versions: 6.5.14(0/6.5.14) 6.5.19-r1(0/6.5.19) (~)7.5.18-r2(0/7.5.18) (~)8.0.44(0/8.0.44){tbz2} (~)8.0.61(0/8.0.61){tbz2} {debugger doc eclipse profiler}
Installed versions: 8.0.61{tbz2}(10:18:48 AM 04/03/2017)(-debugger -doc -eclipse -profiler)
Homepage: https://developer.nvidia.com/cuda-zone
Description: NVIDIA CUDA Toolkit (compiler and friends)

[I] media-video/nvidia_video_sdk
Available versions: 6.0.1{tbz2} 7.1.9{tbz2}[1] {tools}
Installed versions: 7.1.9{tbz2}[?](02:35:06 PM 04/09/2017)(-tools)
Homepage: https://developer.nvidia.com/nvidia-video-codec-sdk
Description: NVIDIA Video Codec SDK

Did you solve this problem? We have exactly the same problem on 20 machines with Pascal GPUs :(

Could you add the parameter -surfaces 48 and see if it helps?
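
For anyone who wants to try this, here is a rough sketch of the command from the first post with -surfaces added. The function name encode_with_surfaces, the FFMPEG override variable, and the input/output arguments are my own placeholders (the original command does not show the file names); every other option is copied from the post. The log above shows the encoder already raising its surface count from 32 to 38 for the lookahead, so 48 sits above that. Whether it avoids the hang is untested.

```shell
#!/usr/bin/env bash
# Sketch only: the options are taken from the original post, with
# "-surfaces" added. Input and output paths are hypothetical arguments.

encode_with_surfaces() {
    local in="$1" out="$2" surfaces="${3:-48}"

    # FFMPEG can be overridden (e.g. FFMPEG=echo) to print the command
    # instead of running it.
    "${FFMPEG:-ffmpeg}" -hwaccel cuvid -c:v h264_cuvid -i "$in" \
        -f mp4 -c:v h264_nvenc -profile:v high -preset slow \
        -pixel_format yuv420p -vf scale_npp=-1:240 \
        -rc vbr_2pass -rc-lookahead 32 -surfaces "$surfaces" \
        -cq 30 -qmin 28 -qmax 30 -r 21 -spatial-aq 1 -aq-strength 12 \
        -c:a libfdk_aac -vbr 1 -ar 44100 -ac 2 "$out"
}
```

Running `FFMPEG=echo encode_with_surfaces in.mkv out.mp4` prints the full command line without touching the GPU, which is a cheap way to check the options before starting a batch.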

Also, according to this link https://devtalk.nvidia.com/default/topic/1004971/nvenclockbitstream-sometimes-returns-nv_enc_err_invalid_param/?offset=1 it should be fixed in 381.09.
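
A quick way to check whether a machine is already on that driver is sketched below. It assumes nvidia-smi is on the PATH and that GNU sort (with -V) is available; the helper name version_ge and the FIXED_IN variable are just illustrative. The installed driver listed in the first post is 378.13, which is older than 381.09.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A is greater than or equal to B.
# Relies on GNU sort's -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

FIXED_IN="381.09"   # driver version named in the linked thread

# Only query the driver when nvidia-smi is actually present.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$installed" "$FIXED_IN"; then
        echo "driver $installed is at or above $FIXED_IN"
    else
        echo "driver $installed is older than $FIXED_IN"
    fi
fi
```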

I do not know yet. I should update the driver and see how it goes.
I gave up on this as NVIDIA was too proud to follow up on this. What a shame…