We are trying to speed up our video transcoding with hardware acceleration. We are using a VM equipped with a GPU, specifically Nvidia’s Tesla K80, and FFmpeg configured with the h264_cuvid decoder and the h264_nvenc encoder. However, we are not seeing as much of an improvement as we expected.
Nvidia Driver: 375.39
n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Haswell (2.3 GHz Intel Xeon E5 v3)
GPUs: 1 x NVIDIA Tesla K80
input.mp4 (1280x720, YUV 4:2:0)
duration: 207 seconds
bitrate: 1258 kb/s
Sample commands & runtimes:
1 to 4 transcoding, CPU:
ffmpeg -y -i input.mp4 \
  -s 1280x720 -b:v 5M -c:a copy out_720p.mp4 \
  -s 640x480 -b:v 3M -c:a copy out_480p.mp4 \
  -s 320x240 -b:v 2M -c:a copy out_240p.mp4 \
  -s 160x128 -b:v 1M -c:a copy out_128p.mp4
time: 36.5 seconds
1 to 4 transcoding, GPU:
ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 \
  -vf scale_npp=1280:720 -c:a copy -c:v h264_nvenc -b:v 5M out_720p.mp4 \
  -vf scale_npp=640:480 -c:a copy -c:v h264_nvenc -b:v 3M out_480p.mp4 \
  -vf scale_npp=320:240 -c:a copy -c:v h264_nvenc -b:v 2M out_240p.mp4 \
  -vf scale_npp=160:128 -c:a copy -c:v h264_nvenc -b:v 1M out_128p.mp4
time: 44.6 seconds
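To put the two runtimes in perspective against the 207-second clip, here is a small sketch of the arithmetic. The helper name speed_report is just something made up for this post, and it only restates the numbers above; it does not measure anything itself.

```shell
#!/bin/sh
# speed_report is a hypothetical helper (not part of ffmpeg).
# Arguments: clip duration, CPU wall time, GPU wall time (all in seconds).
speed_report() {
  LC_ALL=C awk -v d="$1" -v c="$2" -v g="$3" 'BEGIN {
    # Speed relative to realtime = clip duration / wall-clock time.
    printf "CPU: %.2fx realtime\n", d / c
    printf "GPU: %.2fx realtime\n", d / g
    # A ratio above 1 means the GPU run took longer than the CPU run.
    printf "GPU/CPU runtime ratio: %.2f\n", g / c
  }'
}

# Numbers taken from the runs described in this post.
speed_report 207 36.5 44.6
```

With our numbers this works out to roughly 5.67x realtime on the CPU and 4.64x on the GPU, so the GPU run takes about 1.22 times as long.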
As you can see, the CPU-only run is actually faster than the GPU run (36.5 vs. 44.6 seconds), and we are not sure why. Perhaps we are missing something simple. We are using the same commands recommended in Nvidia's PDF “Using FFmpeg with NVIDIA GPU Hardware Acceleration”. Thanks for any assistance!