FFMPEG GPU Transcoding Performance with Data Stream

Hello! I am trying to transcode an H265 UDP Transport Stream and output an H264 UDP Transport Stream. I started with the following command from the NVIDIA page "Using FFmpeg with NVIDIA GPU Hardware Acceleration" (NVIDIA Video Codec SDK Documentation):

ffmpeg -y -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 -c:a copy -c:v h264_nvenc -preset p2 -tune ll -b:v 5M -bufsize 5M -maxrate 10M -qmin 0 -g 250 -bf 3 -b_ref_mode middle -temporal-aq 1 -rc-lookahead 20 -i_qfactor 0.75 -b_qfactor 1.1 output.mp4

I then modified it to be the following:

ffmpeg -y -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -i udp://xxx.xxx.xxx.xxx:xxxx -map 0 -c:v h264_nvenc -preset p2 -tune ll -maxrate 10M -qmin 0 -g 250 -bf 3 -b_ref_mode middle -temporal-aq 1 -rc-lookahead 20 -i_qfactor 0.75 -b_qfactor 1.1 -g 15 -f mpegts udp://xxx.xxx.xxx.xxx:yyyy

The primary changes were changing to MPEGTS input/output, setting GOP to 15 frames, maintaining the same bitrate as input, and mapping the data stream.

The latency is very low when I just transcode video. The problem I’m finding is a massive delay (about 10 seconds) when I add the -map 0 to pass the data stream(s) forward with the transcoded video. Is there a better way to pass the data streams forward? I assume the delay is in re-muxing the stream together? Thank you!

FYI: I’m running this on a DGX server with V100s, using an FFmpeg 4.x build.

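This has been figured out: there is a muxer setting called max_interleave_delta that effectively sets how long the muxing queue can get. It is given in microseconds and defaults to 10000000 (10 seconds), which lines up with the roughly 10-second delay I was seeing when a sparse data stream was mapped. See the FFmpeg Formats Documentation for more information.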
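
For anyone landing here later, a rough sketch of where the option goes. It is an output option, so it sits before the output URL. The value 100000 (0.1 s) is only an illustrative assumption, not the value used in this thread, and the `transcode` helper and `FFMPEG` variable are hypothetical names for the example. Note that 0 does not mean "no delay": per the docs it makes the muxer wait for a packet on every stream, however long that takes.

```shell
# Sketch only: the 100000 microsecond value and the helper name are assumptions.
# FFMPEG can be overridden to point at a specific build.
FFMPEG="${FFMPEG:-ffmpeg}"

# transcode <input-url> <output-url>
# Same options as the command above (without -vsync and the duplicate -g),
# plus -max_interleave_delta to cap how long the muxer waits on sparse streams.
transcode() {
  "$FFMPEG" -y -hwaccel cuda -hwaccel_output_format cuda \
    -i "$1" -map 0 \
    -c:v h264_nvenc -preset p2 -tune ll -maxrate 10M -qmin 0 -g 15 -bf 3 \
    -b_ref_mode middle -temporal-aq 1 -rc-lookahead 20 \
    -i_qfactor 0.75 -b_qfactor 1.1 \
    -max_interleave_delta 100000 \
    -f mpegts "$2"
}
```

It would be called as `transcode udp://xxx.xxx.xxx.xxx:xxxx udp://xxx.xxx.xxx.xxx:yyyy`. The right value probably depends on how sparse the data stream is, so treat the number as a starting point to tune.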

This can be closed now.

Why are you using -vsync 0? The numeric value 0 for that option is deprecated, and besides, all the problems that required vsync are fixed now.

It was included in the docs. I found that it doesn’t matter whether I include it or not, so I haven’t been using -vsync lately. Thanks for flagging that!