Fluctuating performance with two instances of NVENC

I encode AV1 video using NVENC on an RTX 4090, which has two hardware NVENC units.

When running a single instance of the encoder, I get a consistent ~25 fps. When running two instances simultaneously (hoping to make better use of the available hardware), I sometimes get a cumulative ~50 fps, which is what I would expect since the card has two NVENC hardware units. Often, though, the cumulative performance is much worse, somewhere between 25 fps and 40 fps.
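
To make clear what "cumulative" means here, this is a minimal sketch of how such a figure can be computed from per-instance measurements. The names `EncoderRun`, `cumulative_fps` and `scaling_efficiency` are hypothetical helpers for illustration, not part of the NVENC API, and the sketch assumes both instances run over the same wall-clock window.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EncoderRun:
    """One encoder instance's measurement (hypothetical helper type)."""
    frames: int      # number of frames the instance encoded
    seconds: float   # wall-clock time the instance took


def cumulative_fps(runs: List[EncoderRun]) -> float:
    """Sum of the per-instance frame rates.

    Assumes all instances ran concurrently over the same time window,
    so their individual rates can simply be added.
    """
    return sum(run.frames / run.seconds for run in runs)


def scaling_efficiency(runs: List[EncoderRun], single_fps: float) -> float:
    """Cumulative rate relative to perfect scaling of the single-instance rate.

    1.0 means every instance runs as fast as one instance alone;
    0.5 with two instances means no gain over a single instance.
    """
    return cumulative_fps(runs) / (single_fps * len(runs))
```

With a single-instance baseline of 25 fps, two instances that each sustain 25 fps give an efficiency of 1.0, while a cumulative 25 fps across two instances gives 0.5, i.e. no benefit from the second encoder.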

I can’t find a reason why the performance fluctuates so wildly in the dual-encoder case. Running the same program several times gives me anywhere between 25 fps and 50 fps of total encoding speed. Nsight shows that the program keeps both encoders busy at all times (nvEncEncodePicture calls show up back-to-back in the timeline, with no gaps in between), so it can’t be due to not feeding input data fast enough.
I would be very grateful for any help here.