The problem seems fairly basic: I’d like to create thumbnails from incoming video in the shortest time possible, and I’m trying to do this by offloading processing to an NVIDIA GPU.
While ffmpeg runs, I monitor GPU usage with the nvidia-smi utility. GPU usage never goes above 15%, and encoding the thumbnails with the GPU takes only 10% less time than without it. These performance levels are very disappointing.
My question: am I going about this the wrong way (and if so, how should I go about it), or is this GPU performance ‘normal’/‘reasonable’?
SYSTEM INFORMATION
The machine is a desktop PC running Windows 10, with 8 GB RAM and an Intel i7-7700. The GPU is an NVIDIA Quadro Pro 4000 with CUDA 11.4 installed. The ffmpeg build is version N-101372-gb5cb8c8767-g2fc309e699+4 (2021) running under MinGW, configured with --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-libnpp --enable-nvdec and --enable-nvenc.
I’ve varied the above by adding some CUDA-related parameters suggested in posts I’ve read here on Stack Overflow and in the NVIDIA transcoding guide, but haven’t been able to improve performance. Adding any of -hwaccel cuda, -hwaccel cuvid, or -hwaccel nvenc at the beginning of line 3 results in the error: Impossible to convert between the formats supported by the filter 'graph 0 input from stream 0:0' and the filter 'auto_scaler_0'
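For reference, this error usually seems to mean that decoded frames are still in GPU memory when ffmpeg tries to auto-insert a CPU scaler. Below is a minimal sketch of one common workaround, not a confirmed fix for this setup: scale on the GPU with scale_cuda, then move the frame to system memory with hwdownload before the JPEG encoder sees it. The helper name build_thumbnail_cmd, the file names, the 320x180 size and the seek time are all hypothetical. The sketch also assumes 8-bit input (hence format=nv12) and that the build actually includes the scale_cuda filter.

```python
import shlex


def build_thumbnail_cmd(src, dst, width=320, height=180, seek="00:00:05"):
    """Return an ffmpeg argv list that decodes and scales on the GPU.

    Hypothetical helper: the argument names and defaults are illustrative.
    """
    filters = ",".join([
        f"scale_cuda={width}:{height}",  # resize while the frame is still in GPU memory
        "hwdownload",                    # copy the (now small) frame to system memory
        "format=nv12",                   # software pixel format after download (8-bit input assumed)
    ])
    return [
        "ffmpeg", "-y",
        "-hwaccel", "cuda",                # decode with NVDEC
        "-hwaccel_output_format", "cuda",  # keep decoded frames on the GPU
        "-ss", seek,                       # input-side seek to the thumbnail position
        "-i", src,
        "-vf", filters,
        "-frames:v", "1",                  # emit a single image
        dst,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_thumbnail_cmd("input.mp4", "thumb.jpg")))
```

Scaling before the download keeps the transfer over PCIe small. Whether this actually shortens thumbnail generation would have to be measured, since a single-frame job may be dominated by seeking and process start-up rather than by decoding.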
Can you clarify what the GPU is? Quadro 4000 Pro doesn’t mean anything to me. I am aware of the Turing-based Quadro RTX 4000 and also the Fermi-based Quadro 4000 from about ten years ago. However, the latter is no longer supported by modern versions of CUDA (including CUDA 11.4).
Here is the output of nvidia-smi -q that you asked for:
==============NVSMI LOG==============
Timestamp : Thu Aug 5 15:55:26 2021
Driver Version : 471.41
CUDA Version : 11.4
Attached GPUs : 1
GPU 00000000:01:00.0
Product Name : Quadro P4000
Product Brand : Quadro
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : N/A
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : WDDM
Pending : WDDM
Serial Number : 0320518021395
GPU UUID : GPU-e40fbe22-c38d-df0f-df28-63589735ce43
Minor Number : N/A
VBIOS Version : 86.04.56.00.0b
MultiGPU Board : No
Board ID : 0x100
GPU Part Number : 900-5G410-2750-000
Module ID : 0
Inforom Version
Image Version : G410.0501.00.03
OEM Object : 1.1
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x1BB110DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x11A310DE
GPU Link Info
PCIe Generation
Max : 3
Current : 1
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 4000 KB/s
Rx Throughput : 66000 KB/s
Fan Speed : 46 %
Performance State : P8
Clocks Throttle Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
FB Memory Usage
Total : 8192 MiB
Used : 955 MiB
Free : 7237 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 2 MiB
Free : 254 MiB
Compute Mode : Default
Utilization
Gpu : 24 %
Memory : 24 %
Encoder : 0 %
Decoder : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : N/A
Pending : N/A
ECC Errors
Volatile
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Aggregate
Single Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Double Bit
Device Memory : N/A
Register File : N/A
L1 Cache : N/A
L2 Cache : N/A
Texture Memory : N/A
Texture Shared : N/A
CBU : N/A
Total : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows : N/A
Temperature
GPU Current Temp : 34 C
GPU Shutdown Temp : 96 C
GPU Slowdown Temp : 93 C
GPU Max Operating Temp : N/A
GPU Target Temperature : 83 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : Supported
Power Draw : 12.10 W
Power Limit : 105.00 W
Default Power Limit : 105.00 W
Enforced Power Limit : 105.00 W
Min Power Limit : 60.00 W
Max Power Limit : 105.00 W
Clocks
Graphics : 25 MHz
SM : 25 MHz
Memory : 405 MHz
Video : 544 MHz
Applications Clocks
Graphics : 1202 MHz
Memory : 3802 MHz
Default Applications Clocks
Graphics : 1202 MHz
Memory : 3802 MHz
Max Clocks
Graphics : 1708 MHz
SM : 1708 MHz
Memory : 3802 MHz
Video : 1544 MHz
Max Customer Boost Clocks
Graphics : 1708 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 1220
Type : C+G
Name : Insufficient Permissions
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 6444
Type : C+G
Name : C:\Windows\explorer.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 6688
Type : C+G
Name : C:\Windows\SystemApps\MicrosoftWindows.Client.CBS_cw5n1h2txyewy\InputApp\TextInputHost.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 7272
Type : C+G
Name : C:\Program Files\WindowsApps\Microsoft.ZuneVideo_10.20112.10111.0_x64__8wekyb3d8bbwe\Video.UI.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 7356
Type : C+G
Name : C:\Windows\SystemApps\Microsoft.Windows.StartMenuExperienceHost_cw5n1h2txyewy\StartMenuExperienceHost.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 7716
Type : C+G
Name : C:\Windows\SystemApps\Microsoft.Windows.Search_cw5n1h2txyewy\SearchApp.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 8532
Type : C+G
Name : C:\Windows\SystemApps\Microsoft.LockApp_cw5n1h2txyewy\LockApp.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 10212
Type : C+G
Name : C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 10472
Type : C+G
Name : C:\Windows\SystemApps\ShellExperienceHost_cw5n1h2txyewy\ShellExperienceHost.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 12176
Type : C+G
Name : C:\Program Files\WindowsApps\Microsoft.Windows.Photos_2020.20120.4004.0_x64__8wekyb3d8bbwe\Microsoft.Photos.exe
Used GPU Memory : Not available in WDDM driver model
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 12256
Type : C+G
Name : C:\Windows\ImmersiveControlPanel\SystemSettings.exe
Used GPU Memory : Not available in WDDM driver model
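Since the report lists Encoder and Decoder utilization separately from the overall Gpu figure, it may help to watch those two values while ffmpeg is running. The following is a minimal sketch that pulls the Utilization block out of nvidia-smi -q text. The function name parse_utilization is hypothetical, and the parser assumes the "Name : value %" layout shown in the log above.

```python
def parse_utilization(report: str) -> dict:
    """Extract the 'Utilization' block from `nvidia-smi -q` text as {name: percent}."""
    values = {}
    in_block = False
    for raw in report.splitlines():
        line = raw.strip()
        if not in_block:
            # Skip everything until the section header is reached.
            in_block = (line == "Utilization")
            continue
        name, sep, value = line.partition(":")
        if not sep:
            # A line without a colon is the next section header; stop here.
            break
        value = value.strip()
        if value.endswith("%"):
            values[name.strip()] = int(value.rstrip("%").strip())
    return values


if __name__ == "__main__":
    sample = "\n".join([
        "Compute Mode : Default",
        "Utilization",
        "Gpu : 24 %",
        "Memory : 24 %",
        "Encoder : 0 %",
        "Decoder : 0 %",
        "Encoder Stats",
        "Active Sessions : 0",
    ])
    print(parse_utilization(sample))
```

In the log above, both Encoder and Decoder read 0 %, although the P8 performance state and the active Idle flag suggest the snapshot was taken while ffmpeg was not running.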
Yes, a Quadro P4000 is a Pascal-based GPU and as such is supported by recent versions of CUDA. I have no experience with transcoding, but maybe precise knowledge of the GPU type allows someone else to recommend a further course of action.