Accelerating Argus ISP processing

Hello,

We understand from various forum posts (e.g., Nvarguscamerasrc queue-size) that 3A algorithms and other calculations add approx. 4-5 frames of latency when passing through Argus.

  1. Are there any settings on the Argus API side that would reduce this latency? For example, does disabling AWB and AE, or other settings, reduce latency? Is there anything we can do on the API side?

  2. In the current libArgus, it seems that each image gets the latest 3A settings before being passed to the user (after 4-5 frames). Would it be possible to add support for an alternative mode where the image is returned to the user sooner, but with "older" (3 or 4 frames of latency) 3A settings?

Our goal is to reduce the processing latency for live streaming use cases. Thanks.

Sorry, we don't have a plan for this currently.
Maybe you can consider using the V4L2 pipeline for better latency: try using a YUV sensor, or implement software debayering for Bayer sensors.

Thanks, @ShaneCCC. Do you have any information on 1. above as well: can settings at the current API level (for example, disabling AWB or AE) affect latency through Argus? Is there anything we can do at the 5.x API interface to help lower latency?

Hi @ShaneCCC, just quickly checking in about 1. above: can settings at the current API level (for example, disabling AWB or AE) affect latency through Argus? Thanks.

I suspect that wouldn't improve it noticeably.

Thanks

@ShaneCCC, sorry, it's not obvious to me… are there other calculations in the ISP pipeline that cannot be disabled via the API and that require multiple frames?

Sorry to say, there is currently no API to disable any feature for performance.

Thanks

@JerryChang I saw your mention of the recent rawBayerOutput example in this thread. I was wondering whether this will bring down latency compared to using all ISP features in Argus? As far as I know, the ISP is then only used for debayering.

Thanks!

hello philipp12,

That is incorrect. The ISP does a lot of work to convert the camera-captured Bayer content into a YUV420 user-space buffer.

Your test results look identical to our internal evaluation results.
For example:

3A algorithms and other calculations take up approx. 4-5 frames latency when passing through Argus.

So, there is currently no room to reduce processing latency when using the Argus pipeline.
May I know your actual use case, or what your expectation is?

Thanks, @JerryChang!

That is incorrect. The ISP does a lot of work to convert the camera-captured Bayer content into a YUV420 user-space buffer.

Please let me give some additional background: we are looking at a 1080p/60 fps live streaming use case. Our target latency from event to user-space buffer is 50 ms (mean). That would be 3 frame periods (1 for sensor exposure, 1 for ISP functions, 1 to transfer the frame to the user-space buffer), roughly speaking. We could reduce ISP functions (setEnableIspStage(false)) if that helps.

  • Could you please share some more information about where in the pipeline additional capture frames may be buffered, resulting in additional delay beyond 3 frame periods (mean)? We are trying to understand why.
  • From the documentation provided in JetPack 5.0.2, it is also not clear what the difference between setEnableIspStage(false) and setPostProcessingEnable(false) is. Do you have additional information here, and would setting either of those impact the latency discussed above?

Thanks!

hello philipp12,

This is frequently asked.
You might also see these discussion threads, i.e. Topic 55327 and Topic 67377.

Here is some FYI on the internal camera stack…
On the Argus side, it takes care of initiating 2 extra captures for initial sensor exposure programming.
These 2 capture buffers are ignored at the driver level; they are not sent to user space.

So, for the statistics check, the glass-to-glass latency is a 5-frame delay.

Thanks, @JerryChang. I reviewed the links you shared. For the latency reported by argus_camera --kpi, does this correspond to the processing time after the frame has been exposed by the camera, until frame processing is finished and it is available in user space? In argus_camera.cpp, it is calculated as the difference between currentTime and getSensorTimestamp() when the CAPTURE_COMPLETE event is received.

Could you please also comment on my question above about the difference between setEnableIspStage(false) and setPostProcessingEnable(false)?

Thank you again

There is an Argus sample, rawBayerOutput, that demonstrates raw capture using Argus, with options available to enable/disable 3A/ISP for converging sensor exposure settings.
This setting is controlled by the setPostProcessingEnable() API.

Thanks, @JerryChang. Could you please also comment on what precisely is measured by the “latency” output from argus_camera --kpi? We are trying to understand the pipeline, and what exactly this KPI measures. Is this number measuring what happens after sensor exposure is completed, until (ISP) frame processing is finished and the frame is available in user space?

Also, does setPostProcessingEnable(false) have any effect when setEnableIspStage(false) is set? Or is post-processing already disabled when the ISP stage is disabled, making the first call unnecessary?

hello philipp12,

The --kpi option enables the PerfTracker API. For example: PerfTracker 1: latency 32 ms average, min 31 max 33
This is an averaged result. The latency figure is the time between the software sending the capture event and the actual camera frame arriving in user space.

Adding setPostProcessingEnable(false) will disable the 3A controls within the ISP; the results calculated by the camera stack will not be applied to the low-level driver.
