IDR frames and gop_size configuration in nvenc_h264

There was a funny thing that we noticed when encoding our video file using nvenc_h264.

For a 30 fps video, we extracted frames one second at a time and encoded each second independently, with no B frames.

So we decoded the first 30 frames (frames 0 to 29), created a new nvenc encoder, set gop_size to 30 and forced-idr to 1, marked the first frame as a keyframe by setting its pict_type to I and the remaining frames as P frames, and encoded them.

Then we decoded frames 30 to 59 and did the same thing with them.

We repeated this for the first 10 s, and the total length of the output bitstreams was, say, X.

encode frame0(key), frame1(P), frame2(P), ..., frame29(P) => bitstream x0
encode frame30(key), frame31(P), frame32(P), ..., frame59(P) => bitstream x1
...
encode frame270(key), frame271(P), frame272(P), ..., frame299(P) => bitstream x9

X = x0 + x1 + ... + x9

Then we did something different: we decoded the first 300 frames at once, created a new nvenc encoder, set gop_size to 30 and forced-idr to 1, marked every 30th frame as a keyframe and the others as P frames, and encoded those 300 frames in one pass. The total length of the output bitstream was, say, Y.

encode frame0(key), frame1(P), frame2(P), ..., frame30(key), frame31(P), ..., frame270(key), ..., frame299(P) => Y
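
To make the two setups concrete, here is a small Python sketch of the picture types we requested in each case. It is not our encoder code and it does not model nvenc at all; `chunked_plan` and `single_pass_plan` are hypothetical helper names used only for illustration.

```python
def chunked_plan(total_frames, gop_size):
    """Picture types when every chunk of gop_size frames gets a fresh encoder."""
    plan = []
    for start in range(0, total_frames, gop_size):
        chunk_len = min(gop_size, total_frames - start)
        # The first frame of each chunk is forced to I, the rest are P.
        plan.extend(["I"] + ["P"] * (chunk_len - 1))
    return plan


def single_pass_plan(total_frames, gop_size):
    """Picture types when one encoder sees all frames, keyframe every gop_size."""
    return ["I" if n % gop_size == 0 else "P" for n in range(total_frames)]


if __name__ == "__main__":
    a = chunked_plan(300, 30)
    b = single_pass_plan(300, 30)
    print(a == b)        # True: the requested I/P patterns are identical
    print(a.count("I"))  # 10 keyframes for 10 s at 30 fps
```

So the requested frame-type pattern is the same in both setups; the only difference is whether one encoder instance or ten produce the output.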

X was not equal to Y, which was counterintuitive to me. Why?

As far as we know,

  1. IDR frames only use intra-frame prediction to compress data.

  2. IDRs reset the decoder state, allowing for a clean start in video decoding.

  3. IDRs are thus associated with closed GOPs, where pictures that follow the IDR cannot refer to pictures that come before the IDR.

Based on the facts above, in theory, encoding every GOP independently with a new encoder should be equivalent to encoding the whole video file with one encoder that has gop_size and forced-idr set.

That is because, when it encounters an IDR frame, the encoder will just reset its state (clear its buffers, etc.).

Am I correct? Would anyone like to explain why X and Y differ?

You are not correct. CRA (clean random access) exists in AVC too, and those pictures are also keyframes. Not all I frames are keyframes, though, only IDR.
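
One way to check which keyframes in the output are really IDR is to count the NAL unit types in the bitstream. Below is a minimal Python sketch, assuming the encoder output is a raw Annex B stream (start-code delimited, not the length-prefixed form stored in MP4). `count_nal_types` is a made-up helper name, and the demo bytes are a hand-made stream with dummy payloads, not real encoder output.

```python
import re
from collections import Counter

# nal_unit_type values from the H.264 spec that matter here
NAL_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS"}


def count_nal_types(data: bytes) -> Counter:
    """Count NAL unit types in an Annex B byte stream.

    Searching for 00 00 01 also finds the 4-byte start code 00 00 00 01.
    """
    counts = Counter()
    for match in re.finditer(b"\x00\x00\x01", data):
        pos = match.end()
        if pos < len(data):
            nal_type = data[pos] & 0x1F  # low 5 bits of the NAL header byte
            counts[NAL_NAMES.get(nal_type, f"type {nal_type}")] += 1
    return counts


if __name__ == "__main__":
    # SPS, PPS, one IDR slice, one non-IDR slice (payload bytes are dummies)
    demo = (b"\x00\x00\x00\x01\x67\xaa"
            b"\x00\x00\x00\x01\x68\xbb"
            b"\x00\x00\x01\x65\xcc"
            b"\x00\x00\x01\x41\xdd")
    print(dict(count_nal_types(demo)))
```

Note that this counts slices, not pictures, so a picture split into several slices is counted more than once. With one slice per picture, the "IDR slice" count of each output can be compared directly against the number of keyframes requested.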