No Improvement Using Hwaccel for Video Transcoding with Ffmpeg

We are trying to reduce our video transcoding time with hardware acceleration. We are using a VM equipped with a GPU, specifically Nvidia’s Tesla K80, and ffmpeg configured with the h264_cuvid decoder and the h264_nvenc encoder. However, we are not seeing the improvement we expected.

System specs:
Ubuntu 16.04
Nvidia Driver: 375.39

Machine type
n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Haswell (2.3 GHz Intel Xeon E5 v3)
GPUs: 1 x NVIDIA Tesla K80

Sample input:
input.mp4 (1280x720, YUV 4:2:0)
duration: 207 seconds
bitrate: 1258 kb/s

Sample commands & runtimes:

1 to 4 transcoding, CPU:

ffmpeg -y -i input.mp4 \
    -s 1280x720 -b:v 5M -c:a copy out_720p.mp4 \
    -s 640x480  -b:v 3M -c:a copy out_480p.mp4 \
    -s 320x240  -b:v 2M -c:a copy out_240p.mp4 \
    -s 160x128  -b:v 1M -c:a copy out_128p.mp4

time: 36.5 seconds

1 to 4 transcoding, GPU:

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 \
    -vf scale_npp=1280:720 -c:a copy -c:v h264_nvenc -b:v 5M out_720p.mp4 \
    -vf scale_npp=640:480  -c:a copy -c:v h264_nvenc -b:v 3M out_480p.mp4 \
    -vf scale_npp=320:240  -c:a copy -c:v h264_nvenc -b:v 2M out_240p.mp4 \
    -vf scale_npp=160:128  -c:a copy -c:v h264_nvenc -b:v 1M out_128p.mp4

time: 44.6 seconds
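
To put the two runtimes on the same scale, they can be expressed as a multiple of realtime (clip duration divided by wall-clock transcode time). This is only a minimal sketch using the numbers reported above; the helper name realtime_factor is made up for illustration and is not part of ffmpeg.

```shell
#!/bin/sh
# Hypothetical helper: realtime factor = clip duration / transcode runtime.
# $1: clip duration in seconds, $2: wall-clock transcode time in seconds
realtime_factor() {
    awk -v d="$1" -v t="$2" 'BEGIN { printf "%.2f\n", d / t }'
}

# Values taken from the post: 207 s clip, 36.5 s on CPU, 44.6 s on GPU.
cpu=$(realtime_factor 207 36.5)
gpu=$(realtime_factor 207 44.6)

echo "CPU: ${cpu}x realtime"
echo "GPU: ${gpu}x realtime"
```

With these inputs the CPU run works out to roughly 5.67x realtime and the GPU run to roughly 4.64x, so the GPU run took about 22% longer.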

As you can see, the CPU run is actually faster than the GPU run, and we are not sure why. Perhaps we are missing something simple. We are using the commands recommended in the PDF “Using Ffmpeg with NVIDIA GPU Hardware Acceleration” provided by Nvidia. Thanks for any assistance!

Hi, I read your post “No Improvement Using Hwaccel for Video Transcoding with Ffmpeg” with great interest, as I am looking into the possibility of doing exactly this. Did you manage to sort this out? How far did you get with it?

Regards.

Unfortunately, we never saw any improvement on the GPU, so we are still transcoding strictly on the CPU.

Hi, could I ask: did you only use the commands from the first post, or did you also experiment with other commands or ffmpeg builds?

Regards.