vi_capture_status function takes about one frame time of latency

In the file kernel/nvidia/drivers/media/platform/tegra/camera/vi/vi5_fops.c, the vi5_capture_dequeue function calls vi_capture_status to wait for the capture request deployed in vi5_capture_enqueue to complete. The time spent is about 50 ms, which is one frame time (the camera outputs at 20 Hz). When we increase the frame rate to 25 Hz, the time becomes 40 ms. What is the mechanism behind this? It also introduces a relatively large delay. Is there a way to optimize it? Thank you!
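To illustrate my understanding of where the time goes, here is a minimal sketch of the wait pattern I believe vi_capture_status implements internally, assuming the dequeue path blocks on a completion that is only signaled once the sensor has output the whole frame (the names my_channel, capture_done, and CAPTURE_TIMEOUT_MS are placeholders of mine, not the driver's):

```c
/* Simplified sketch of the dequeue wait, not the actual vi5_fops.c code.
 * The request enqueued for buffer N only completes after the sensor has
 * output that entire frame, hence the roughly one-frame-time latency. */
#include <linux/completion.h>
#include <linux/jiffies.h>
#include <linux/errno.h>

#define CAPTURE_TIMEOUT_MS 2500 /* placeholder timeout */

struct my_channel {
	struct completion capture_done; /* signaled by the capture-done path */
};

static int my_capture_status(struct my_channel *chan)
{
	/* Blocks for about one frame time at steady state. */
	if (!wait_for_completion_timeout(&chan->capture_done,
					 msecs_to_jiffies(CAPTURE_TIMEOUT_MS)))
		return -ETIMEDOUT;
	return 0;
}
```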
Best regards!

This is because the driver needs to fill the ring buffer (3 buffers) before returning to user space.
Please refer to the topic below.

The changes in that post are for TX2. I added print statements, and Xavier does not use the tegra_channel_ring_buffer function. Maybe their mechanisms are different?
To reduce the delay, I set nbuffers in the vi5_channel_setup_queue function to 2, and the measured delay is about 90 ms (the time from the external trigger until user space gets the data). If nbuffers is set to 3, the time increases by 50 ms (one frame); if nbuffers is 4, it increases by one more frame. Can we consider the current 90 ms to already be the minimum delay? If nbuffers is less than 2, frames are dropped.
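For reference, the change looks roughly like this, written against the standard videobuf2 queue_setup signature (my_queue_setup, MIN_QUEUED_BUFFERS, and FRAME_SIZE are placeholder names of mine; the real driver computes the buffer size from the active format):

```c
/* Sketch of clamping the minimum queued-buffer count in a vb2 queue_setup
 * callback. MIN_QUEUED_BUFFERS = 2 is from my own testing on this setup:
 * fewer drops frames, and each extra buffer adds one frame time of delay. */
#include <media/videobuf2-core.h>

#define MIN_QUEUED_BUFFERS 2
#define FRAME_SIZE (1920 * 1080 * 2) /* placeholder: bytes per frame */

static int my_queue_setup(struct vb2_queue *vq,
			  unsigned int *nbuffers, unsigned int *nplanes,
			  unsigned int sizes[], struct device *alloc_devs[])
{
	/* At least two buffers so capture can ping-pong without drops,
	 * but no more, since each queued buffer delays dequeue by a frame. */
	if (*nbuffers < MIN_QUEUED_BUFFERS)
		*nbuffers = MIN_QUEUED_BUFFERS;

	*nplanes = 1;
	sizes[0] = FRAME_SIZE;
	return 0;
}
```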

Yes, I think setting nbuffers to 2 gives the minimum delay. How did you profile the 90 ms?

I registered an external interrupt (PWM trigger) line in the kernel driver, captured the system time in the top half of the interrupt handler, and saved it in a global array. Each time vi5_capture_enqueue runs, I assign that timestamp to the vb2_buf. At the application layer, VIDIOC_DQBUF returns the image data (carrying the trigger timestamp); I take a second timestamp at that point, and subtracting the two gives 90 ms. There will be a difference between the interrupt trigger time (system time) and the actual exposure time. We tried to measure this difference, but limited by the screen refresh rate, the value we obtained is not very accurate; still, I think the actual value should be close to 90 ms. Do you have any suggestions to improve this measurement method?
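In case it is useful to others, the hooks look roughly like this (a simplified sketch: trigger_irq_handler, trigger_ts, stamp_buffer, and the ring size are my own names, and matching a trigger to the right frame index needs care in the real code):

```c
/* Sketch of the trigger-timestamp measurement: latch the system time in the
 * PWM-trigger interrupt's top half, then stamp it into the vb2 buffer at
 * enqueue time so user space can subtract it from the DQBUF return time. */
#include <linux/interrupt.h>
#include <linux/ktime.h>
#include <media/videobuf2-v4l2.h>

#define TS_RING_SIZE 8

static u64 trigger_ts[TS_RING_SIZE];
static unsigned int ts_head;

static irqreturn_t trigger_irq_handler(int irq, void *dev_id)
{
	/* Top half: record the trigger time as early as possible. */
	trigger_ts[ts_head % TS_RING_SIZE] = ktime_get_ns();
	ts_head++;
	return IRQ_HANDLED;
}

/* Called from the enqueue path: attach the matching trigger time. */
static void stamp_buffer(struct vb2_v4l2_buffer *vbuf, unsigned int frame_idx)
{
	vbuf->vb2_buf.timestamp = trigger_ts[frame_idx % TS_RING_SIZE];
}
```

Since ktime_get_ns() is on the CLOCK_MONOTONIC base, user space can call clock_gettime(CLOCK_MONOTONIC) right after VIDIOC_DQBUF returns and subtract the buffer timestamp directly, with both times on the same clock.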

It seems that further optimization of this delay requires a relatively large amount of work. Does NVIDIA have any plans for it? Any suggestions if we change it ourselves?

Hi ShaneCCC, any suggestions about this? Your reply would be greatly appreciated, thank you very much!

I have a concern about the profiling. When does the interrupt fire? I think the correct timestamp is the time when NVCSI receives the SOF.

Yes, the SOF time is when the image arrives at NVCSI. When we measure with that time (SOF), the delay is 50 ms, but what we want is the timestamp of the exposure, that is, the timestamp obtained in the interrupt. When the rising edge of the PWM is generated, the camera is triggered to take a picture, and at the same time we capture a timestamp from that rising edge (via the external interrupt).
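For completeness, the 50 ms SOF number can be reproduced from user space alone, assuming the VI driver stamps v4l2_buffer.timestamp with the SOF time on the monotonic clock (sof_to_dqbuf_ns is my own helper; opening the device, requesting and mmapping buffers, and starting streaming are omitted):

```c
/* Sketch: measure SOF -> user-space latency by subtracting the buffer's
 * SOF timestamp from CLOCK_MONOTONIC "now" right after VIDIOC_DQBUF. */
#include <stdint.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

static int64_t sof_to_dqbuf_ns(int fd)
{
	struct v4l2_buffer buf = {
		.type = V4L2_BUF_TYPE_VIDEO_CAPTURE,
		.memory = V4L2_MEMORY_MMAP,
	};
	struct timespec now;
	int64_t sof_ns, now_ns;

	if (ioctl(fd, VIDIOC_DQBUF, &buf) < 0)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &now);
	sof_ns = (int64_t)buf.timestamp.tv_sec * 1000000000LL
	       + buf.timestamp.tv_usec * 1000LL;
	now_ns = (int64_t)now.tv_sec * 1000000000LL + now.tv_nsec;

	ioctl(fd, VIDIOC_QBUF, &buf); /* requeue for the next frame */
	return now_ns - sof_ns;
}
```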

If the SOF time is used, the time from camera trigger -> exposure -> data transmission to NVCSI is unknown, and the time synchronization result will be incorrect.

I think the latency should be measured starting from when NVCSI receives the SOF. The SOF means the sensor has started sending data to NVCSI.

Yes, the delay of Xavier itself is 50 ms. Is it possible to reduce it?

Using a high frame rate like 120 fps would help with that, I think.

Yes, I found that the delay is one frame time, and increasing the frame rate reduces the delay. But do you mean a high-frame-rate camera, or increasing the frame rate separately? If it has to be a high-frame-rate camera, there is nothing we can do :( If we can increase the frame rate alone, what do we need to do?

If your sensor can increase the frame rate, you can configure its registers to output a higher fps as the default in order to reduce the delay.
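For example, on a typical raw sensor the default frame rate is set by the frame-length (VTS) register, so a mode-setup change like this sketch would raise the default fps. The register addresses 0x0340/0x0341 and sensor_set_vts are illustrative only; check your sensor's datasheet for the real registers and the minimum legal VTS:

```c
/* Hypothetical sketch: raise the sensor's default frame rate by shrinking
 * the frame-length-lines (VTS) value in the driver's mode setup. */
#include <linux/types.h>
#include <linux/regmap.h>

#define SENSOR_REG_VTS_H 0x0340 /* illustrative: frame-length high byte */
#define SENSOR_REG_VTS_L 0x0341 /* illustrative: frame-length low byte */

static int sensor_set_vts(struct regmap *map, u16 vts)
{
	int ret;

	/* Halving VTS (while respecting exposure limits) roughly doubles fps. */
	ret = regmap_write(map, SENSOR_REG_VTS_H, vts >> 8);
	if (ret)
		return ret;
	return regmap_write(map, SENSOR_REG_VTS_L, vts & 0xff);
}
```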

I understand. But the frame rate and delay are related. Our camera is now 20 Hz with a 90 ms delay, which is about two frames late. If the frame rate is increased to 40 Hz, the delay may be reduced by 25 ms, but in fact it is still two frames late. If there is no way to reduce it further, we can only use it this way.
Are there other ways to get images, such as Argus? If we use Argus, do we have to use the NVIDIA ISP? We are now using a camera with a front ISP (GMSL), which is connected to CSI after passing through the ser-des.

Argus only supports Bayer sensors. And yes, it includes the ISP pipeline, which means the latency would be higher than with V4L2.

I got it, thank you very much!