I am seeing an odd VRAM allocation difference when running the same ffmpeg NVDEC decode on different GPU hardware.
A simple h264_cuvid invocation like the following:
ffmpeg -c:v h264_cuvid -surfaces 8 -f mpegts -i https://samples.ffmpeg.org/V-codecs/h264/HD-h264.ts -vcodec libx264 -preset veryfast -crf 23 -c:a copy -f mpegts transcoded.ts
allocates 153MB of VRAM on a GTX 1070 Ti under Linux with driver 390.77 (also tested with 384.110), but only 87MB of VRAM on a GTX 1050 Ti under Linux with the same drivers.
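For scale, the raw decode surfaces themselves should only account for a small part of those numbers. A rough back-of-the-envelope estimate, assuming 1080p NV12 output (height padded to 1088, which is an assumption about how the decoder rounds, not something taken from the driver):

```shell
# Rough estimate of decode-surface memory for the command above.
# Assumptions: NV12 (12 bits/pixel), 1920x1088 padded surfaces,
# 8 surfaces as requested with -surfaces 8.
WIDTH=1920
HEIGHT=1088
SURFACES=8
BYTES_PER_SURFACE=$((WIDTH * HEIGHT * 3 / 2))          # NV12 = 1.5 bytes/pixel
TOTAL_MB=$((BYTES_PER_SURFACE * SURFACES / 1024 / 1024))
echo "~${TOTAL_MB} MB for decode surfaces"             # prints "~23 MB for decode surfaces"
```

So roughly 23MB of the footprint is surfaces; the remaining 60-130MB is presumably CUDA context and driver overhead, which is exactly the part that differs between the cards.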
Interestingly, the GTX 1070 Ti under Windows allocates 132MB: less than under Linux, but still more than the GTX 1050 Ti. Unfortunately I couldn't test the 1050 Ti on Windows. The Windows driver is 391-something (the latest one Windows 10 installs by default).
I have also tested memory allocation with the AppDec sample from the Video Codec SDK, and it shows similar results:
205MB allocated on the GTX 1070 Ti (higher because AppDec uses 20 surfaces) and 139MB on the GTX 1050 Ti.
Similar VRAM allocation differences occur with encoding too.
Is this a bug in the nvdec/nvenc libraries and SDK? Is it a driver bug? Or is this considered normal?
VRAM allocation for the GTX 1070 Ti on Linux is quite high compared to both Windows and the GTX 1050 Ti, which limits the number of concurrent decoding sessions (encoding sessions are limited anyway on non-Quadro cards).