FFMPEG doesn't detect CUDA-capable device [cuvidGetDecoderCaps(&caps) failed]

Environment:
GPU: GTX 750 Ti
Driver: 460.32.03
CUDA: 11.2
FFMPEG: 4.4
OS: Debian 10 (Headless)

Hello,

I am trying to set up NVENC hardware acceleration with ffmpeg. However, when I try to transcode an H.264 file with CUDA acceleration, the following error is shown:

[h264 @ 0x5639d6d01a80] decoder->cvdl->cuvidGetDecoderCaps(&caps) failed -> CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
[h264 @ 0x5639d6d01a80] Failed setup for format cuda: hwaccel initialisation returned error.

The issue is that the error keeps showing up even though the GPU seems to be detected as CUDA-capable.

./deviceQuery output:

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 750 Ti"
CUDA Driver Version / Runtime Version 11.2 / 11.2
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 1999 MBytes (2096168960 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1163 MHz (1.16 GHz)
Memory Clock rate: 2750 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 65536 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: No
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 6 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.2, CUDA Runtime Version = 11.2, NumDevs = 1
Result = PASS
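
One cross-check that may be worth adding here: deviceQuery exercises the CUDA runtime, while ffmpeg's CUDA/NVDEC code loads the driver libraries (libcuda.so.1 and libnvcuvid.so.1) at run time. The sketch below calls cuInit and cuDeviceGetCount through libcuda.so.1 directly and checks whether libnvcuvid.so.1 can be loaded. It is only a diagnostic sketch, not a fix: the helper names probe_driver_api and can_load are made up for illustration, and the result is only meaningful if it is run as the same user and in the same environment as ffmpeg.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_NO_DEVICE = 100  # numeric value of the error ffmpeg reports


def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        return False


def probe_driver_api(lib_name="libcuda.so.1"):
    """Return (status, device_count) using the CUDA driver API directly."""
    try:
        cuda = ctypes.CDLL(lib_name)
    except OSError:
        return ("library not found", None)

    # CUresult cuInit(unsigned int Flags)
    rc = cuda.cuInit(0)
    if rc != CUDA_SUCCESS:
        return (f"cuInit failed with code {rc}", None)

    # CUresult cuDeviceGetCount(int *count)
    count = ctypes.c_int(0)
    rc = cuda.cuDeviceGetCount(ctypes.byref(count))
    if rc != CUDA_SUCCESS:
        return (f"cuDeviceGetCount failed with code {rc}", None)

    return ("ok", count.value)


if __name__ == "__main__":
    print("driver API:", probe_driver_api())
    print("libnvcuvid.so.1 loadable:", can_load("libnvcuvid.so.1"))
```

If this reports a failure code of 100 (CUDA_ERROR_NO_DEVICE) while deviceQuery passes, the difference between the two environments (user, permissions, library paths) would be the first thing to compare.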

nvidia-smi: [screenshot]

Full output of the transcode command:
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -c:v h264_nvenc output.mp4

ffmpeg version N-101924-g282682a9fd Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 8 (Debian 8.3.0-6)
configuration: --prefix=/home/ysh/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/ysh/ffmpeg_build/include --extra-ldflags=-L/home/ysh/ffmpeg_build/lib --extra-libs='-lpthread -lm' --ld=g++ --bindir=/usr/local/bin --enable-gpl --enable-gnutls --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libx264 --enable-libx265 --enable-nonfree --enable-cuda --enable-cuvid --enable-nvdec --enable-nvenc --enable-nonfree --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64
libavutil 56. 72.100 / 56. 72.100
libavcodec 58.136.100 / 58.136.100
libavformat 58. 78.100 / 58. 78.100
libavdevice 58. 14.100 / 58. 14.100
libavfilter 7.111.100 / 7.111.100
libswscale 5. 10.100 / 5. 10.100
libswresample 3. 10.100 / 3. 10.100
libpostproc 55. 10.100 / 55. 10.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
Metadata:
major_brand : isom
minor_version : 1
compatible_brands: isomavc1
creation_time : 2007-05-09T07:55:25.000000Z
Duration: 00:01:29.22, start: 0.000000, bitrate: 7490 kb/s
Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 1920x816, 7403 kb/s, 23.98 fps, 23.98 tbr, 24k tbn, 47.95 tbc (default)
Metadata:
creation_time : 2007-05-09T07:55:25.000000Z
handler_name : GPAC ISO Video Handler
vendor_id : [0][0][0][0]
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 94 kb/s (default)
Metadata:
creation_time : 2007-05-09T07:55:29.000000Z
handler_name : GPAC ISO Audio Handler
vendor_id : [0][0][0][0]
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_nvenc))
Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
[h264 @ 0x555bbe82fa80] decoder->cvdl->cuvidGetDecoderCaps(&caps) failed -> CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
[h264 @ 0x555bbe82fa80] Failed setup for format cuda: hwaccel initialisation returned error.
Segmentation Fault
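
Since the failing call (cuvidGetDecoderCaps) sits in the h264 decoder's hwaccel setup rather than in h264_nvenc, one way to narrow this down might be to test decoding and encoding separately. The sketch below only builds and prints two such command lines: one that uses CUDA decoding with no encoder, and one that uses software decoding with the NVENC encoder. The function name isolation_commands and the dictionary keys are hypothetical; the ffmpeg options themselves (-hwaccel, -an, -c:v, -f null) are standard ones.

```python
import shlex


def isolation_commands(src="input.mp4"):
    """Build ffmpeg commands that exercise NVDEC and NVENC separately."""
    return {
        # Hardware decode only: frames are decoded and discarded.
        "nvdec_only": ["ffmpeg", "-hide_banner", "-hwaccel", "cuda",
                       "-i", src, "-an", "-f", "null", "-"],
        # Software decode, hardware encode: no -hwaccel option at all.
        "nvenc_only": ["ffmpeg", "-hide_banner", "-i", src, "-an",
                       "-c:v", "h264_nvenc", "-f", "null", "-"],
    }


if __name__ == "__main__":
    for name, cmd in isolation_commands().items():
        # Print shell-quoted commands so they can be copied and run by hand.
        print(name + ":", " ".join(shlex.quote(arg) for arg in cmd))
```

When running these, the log messages are more informative than the exit code, because ffmpeg may fall back to software decoding after a failed hwaccel initialisation.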

Thanks in advance to anyone who has a clue about the cause of this issue.