Asking for update on "JP5.1 nvarguscamera doesn’t recover from single NVCSI failure"

Hi,

We’re running into an issue similar to one documented in an old thread in the forums. That thread was locked with no resolution, so we’re curious whether this issue is tracked internally at Nvidia and being worked on.

We are encountering this issue intermittently in the field and have a workaround that restarts the Argus daemon when we detect it crashing, but we would really prefer that the camera stack recover from an occasional error on its own.

hello aleks.odorovic,

we did have some camera bug fixes regarding error recovery,
please move to the latest JP-5 release version, JetPack-5.1.3, for verification.

Hi Jerry,

I can confirm that there are no crashes.

My repro step is to run "$ echo '0' > /sys/kernel/debug/camera0-video/streaming", which does the same register writes as are done when camera streaming is stopped. This puts the camera in standby.

Now, with the camera in standby, Argus waits for frames and reports timeouts indefinitely. That is not entirely unreasonable behavior, as we did put the camera to sleep out from under it.

This means that we still need our internal recovery mechanism, as we know that in the field our camera will sometimes keep timing out for long periods of time.

It might be nice if Argus were able to detect that it has been timing out for a long period of time and restart, if instructed to in some config.
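A rough sketch of such a check done outside Argus, assuming the application logs one line per timeout and one line per delivered frame. The two patterns, the threshold, and the function name `watch_timeouts` are hypothetical placeholders, not real Argus log output or settings:

```shell
#!/bin/sh
# Hypothetical patterns: adjust them to whatever your application really logs.
TIMEOUT_PATTERN="${TIMEOUT_PATTERN:-timeout}"
FRAME_PATTERN="${FRAME_PATTERN:-frame ok}"
# Assumed threshold: timeouts tolerated before recovery is triggered.
MAX_TIMEOUTS="${MAX_TIMEOUTS:-5}"

# Reads log lines on stdin. Runs the command given as arguments once
# MAX_TIMEOUTS timeout lines have been seen with no good frame in between.
watch_timeouts() {
    count=0
    while IFS= read -r line; do
        case "$line" in
            *"$FRAME_PATTERN"*)   count=0 ;;                 # stream is healthy again
            *"$TIMEOUT_PATTERN"*) count=$((count + 1)) ;;    # one more timeout
        esac
        if [ "$count" -ge "$MAX_TIMEOUTS" ]; then
            "$@"        # caller-supplied recovery command
            count=0     # start counting again after recovery
        fi
    done
}
```

It could be fed from the application's log, for example "$ journalctl -u my-camera-app -f | watch_timeouts ./recover-camera.sh", where both the unit name and the recovery script are made-up names. Counting log lines is only one option; a time-since-last-frame check inside the application would be more direct.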

hello aleks.odorovic,

these are software commands to force-stop the camera stream.

FYI, Argus by default expects the camera stream to arrive continuously without failures.
since there are timeout failures from the camera pipeline, Argus will report them via EVENT_TYPE_ERROR and the application has to shut down.
Argus should be able to launch the camera stream again after the application has restarted.

Thanks for the info. Do we need to actually shut the process down to clean up shared library state, or can we recreate a new Camera session?

hello aleks.odorovic,

you have to restart the Argus daemon and the camera application to recover from the failure state.
for example,
$ sudo pkill nvargus-daemon
$ sudo systemctl start nvargus-daemon
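
For scripted recovery, the two commands above can be wrapped in a function that also waits for systemd to report the daemon as active before the camera application is relaunched. This is only a sketch: the function name `restart_argus` and the `WAIT_SECONDS` limit are assumptions, and restarting the camera application afterwards is left to the caller.

```shell
#!/bin/sh
# Assumed, arbitrary limit on how long to wait for the daemon to come back.
WAIT_SECONDS="${WAIT_SECONDS:-10}"

# Restarts nvargus-daemon and returns 0 once systemd reports it active,
# or 1 if starting fails or the daemon is not active within WAIT_SECONDS.
restart_argus() {
    # pkill exits non-zero when no process matched; that is not an error here.
    sudo pkill nvargus-daemon || true
    sudo systemctl start nvargus-daemon || return 1

    waited=0
    while [ "$waited" -lt "$WAIT_SECONDS" ]; do
        if systemctl is-active --quiet nvargus-daemon; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

A caller would then do something like "$ restart_argus && ./start-camera-app.sh", where the application start script is a made-up name.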

Yep. We’re doing that. Sorry for missing the reply.

I personally don’t think this is as robust a setup as it should be, but we’re unblocked. Thanks!

hello aleks.odorovic,

thanks for the status update, are we able to close this discussion thread?
please do move to the latest JP-5 release version, JetPack-5.1.3, for verification.

Yeah, thanks. Will update when we can. The version we use (5.1.2) behaves the same way you described, so we’re as good as we can be for now. Thanks!