Given an input video called “input.yuv”, which is a set of raw YUV420p frames, I would like to produce an H.264 output video in which no individual frame exceeds “X” bits. An extra condition is that the output must consist of only the initial I-frame followed by P-frames (i.e. iframeinterval greater than the number of input frames, and no-B-Frames=false).
I have not found any way to control the maximum frame size other than setting an average bitrate, so I tested the VBV buffer size (vbv-size), which should in theory set the size of the encoder's virtual buffer. As I understand it, this should bound the number of bits in any one frame. It is defined as Virtual Buffer Size = vbv-size * (bitrate / fps), so at 10 Mbps and 30 fps, vbv-size=1 should produce a virtual buffer of about 333 kb (333,333 bits).
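The buffer arithmetic above can be checked with a small shell sketch. It only evaluates the formula as quoted in this post; the function name vbv_buffer_bits is a hypothetical helper, and whether the encoder actually applies this formula is exactly what is in question here.

```shell
#!/bin/sh
# Virtual buffer size in bits, per the formula quoted in the post:
#   buffer = vbv-size * (bitrate / fps)
# awk is used because fps may be fractional (e.g. 29.97).
# vbv_buffer_bits is a hypothetical helper name, not part of any tool.
vbv_buffer_bits() {
    # $1 = vbv-size, $2 = bitrate in bits/s, $3 = frames per second
    awk -v v="$1" -v b="$2" -v f="$3" 'BEGIN { printf "%d\n", v * b / f }'
}

vbv_buffer_bits 1 10000000 30       # prints 333333
vbv_buffer_bits 30 10000000 29.97   # vbv-size=30 at the pipeline's frame rate
```

With vbv-size=30 the computed buffer holds roughly one second of video at the target bitrate, so a cap that size would not constrain a single frame much even if it were honored.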
Here is my GStreamer pipeline to encode the raw video.
BITRATE=10000000
GOPSIZE=400
VBVSIZE=30
gst-launch-1.0 filesrc location=input.yuv \
! videoparse width=1920 height=1080 framerate=2997/100 format=i420 \
! omxh264enc bitrate=$BITRATE no-B-Frames=false iframeinterval=$GOPSIZE vbv-size=$VBVSIZE \
! 'video/x-h264, stream-format=(string)byte-stream' \
! h264parse \
! qtmux \
! filesink location=framesize-limit.mp4
And to inspect frames we can use ffprobe:
ffprobe framesize-limit.mp4 -show_frames | grep pkt_size
And further, to gather the frame sizes in bytes to a file:
ffprobe framesize-limit.mp4 -show_frames | grep pkt_size | cut -d = -f 2 > framesizes.csv
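To compare the dumped sizes against a limit of “X” bits, something like the following sketch could be used. It assumes the file holds one pkt_size value per line, in bytes, as produced by the command above; check_frame_limit is a hypothetical helper name.

```shell
#!/bin/sh
# Report the largest frame and how many frames exceed a limit.
# $1 = file with one pkt_size value (bytes) per line
# $2 = limit X in bits
# check_frame_limit is a hypothetical helper name.
check_frame_limit() {
    awk -v limit="$2" '
        NF == 0 { next }                 # skip blank lines
        {
            bits = $1 * 8                # pkt_size is in bytes; convert to bits
            n++
            if (bits > max) { max = bits; maxn = n }
            if (bits > limit) over++
        }
        END {
            printf "frames=%d max_bits=%d at_frame=%d over_limit=%d\n", n, max, maxn, over
        }
    ' "$1"
}
```

For example, `check_frame_limit framesizes.csv 333333` would test every frame against the 333 kb buffer computed for vbv-size=1.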
Experiment:
- Set vbv-size to 1 and to 30.
- Set iframeinterval to 30 and to a value above 300.
- Set bitrate to 10 Mbps and to 500 kbps.
- Use ffprobe to dump the frame data and inspect the frame sizes.
Results:
At a given iframeinterval and bitrate, changing vbv-size does not change the number of bits used to encode each frame. In fact, the frame sizes are identical. For a high-motion video, the first 30 frames (1 second at 29.97 fps) come out almost 50% over the specified bitrate. The first frame, for example, was well over 400 kb, which exceeds the 333 kb virtual buffer size.
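The overshoot in the first second can be quantified from the same frame-size dump. This is a minimal sketch, assuming one pkt_size value in bytes per line; first_window_bps is a hypothetical helper name, and the result is simply the average bitrate over the first N frames.

```shell
#!/bin/sh
# Average bitrate (bits/s) over the first N frames.
# $1 = file with one pkt_size value (bytes) per line
# $2 = N, the number of frames in the window
# $3 = frames per second
# first_window_bps is a hypothetical helper name.
first_window_bps() {
    awk -v n="$2" -v fps="$3" '
        NR <= n { bytes += $1; seen++ }
        # bits in the window divided by the window duration (seen / fps)
        END { if (seen > 0) printf "%d\n", bytes * 8 * fps / seen }
    ' "$1"
}
```

Running `first_window_bps framesizes.csv 30 29.97` and comparing the result with $BITRATE gives the overshoot for the first second directly.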
I would like to know how I can cap the number of bits any given frame uses to some constant, say “X”. How can I set the maximum bits per frame when encoding with omxh264enc?
Notes:
- The polarity of the no-B-Frames property seems backwards: a test shows that 'true' results in B-frames, while 'false' results in no B-frames.