Jetson NX reboots if camera chip is paused for more than 1-2 minutes [follow]

Greetings!

This is a followup of a previous topic: Jetson NX reboots if camera chip is paused for more then 1-2 minutes
It closed before I could gather the new information needed. Sorry for the inconvenience!

Based on the previous questions we’ve checked the temperature warnings.

We do not believe that there is a thermal problem, though the Jetson frequently reports throttling. In tests with higher frame rates and GPU loads, the setup runs for days. Speculatively, there are a few possible causes for the reboot:

  1. When we use a very low frame rate (less than 1 frame every few minutes), the board resets after a couple of minutes. Is it possible that the firmware uses the camera’s thermal sensor for something and that the lack of this data causes a timeout?
  2. The MIPI controller/interface has a low-level timeout configured?
  3. Something triggers an I2C error and this causes a reboot (as we have seen when the I2C bus gets into an unrecoverable lock state).

Can someone comment on the above or suggest how we can get to the root of this, please?

László

hello vargalg,

by default the camera pipeline waits 2500ms for frames, and it reports a timeout if camera frames are not coming.

may I know what’s your actual use-case?
you may try the commands below before launching the camera to enable infinite timeout.
for example,
$ sudo service nvargus-daemon stop
$ sudo enableCamInfiniteTimeout=1 nvargus-daemon

Thank you for the reply,

We already use enableCamInfiniteTimeout=1. That did allow us to wait a few minutes.
But still after 3-5 minutes (it looks like this time is varying depending on the board) the device just restarts. No specific argus error message, nothing in the system logs around that time, so no indication of what might be the problem.

Earlier advice suggested that it might be an overheating problem, so we checked the temperatures, but no overheating occurred.
The IMX 477 sensor that we use does have a thermal sensor, so one of our guesses is that the thermal data arrives embedded with the image data. If we stop the camera feed, the system might then assume that there is some thermal/overheating problem and reboot. But this is still a guess.
We also think it might be low-level (firmware) behaviour, as there is nothing in the system logs before the restart.
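
For anyone who wants to repeat the temperature check, here is a minimal sketch of a logger that keeps the last readings on disk even if the board resets hard. It assumes the standard Linux thermal sysfs layout (`/sys/class/thermal/thermal_zone*/type` and `temp`, in millidegrees Celsius). The function names, the log file name and the 1 second period are made up for illustration, and the camera sensor's own thermal reading is not covered here.

```python
import glob
import os
import time


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"<zone dir>:<zone type>": degrees C} for every readable zone."""
    readings = {}
    for zone in sorted(glob.glob(os.path.join(base, "thermal_zone*"))):
        try:
            with open(os.path.join(zone, "type")) as f:
                name = f.read().strip()
            with open(os.path.join(zone, "temp")) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip zones that cannot be read or parsed
        readings[f"{os.path.basename(zone)}:{name}"] = millideg / 1000.0
    return readings


def log_forever(path="thermal.log", period_s=1.0):
    """Append one timestamped line per period until the process is killed."""
    with open(path, "a") as log:
        while True:
            log.write(f"{time.time():.1f} {read_thermal_zones()}\n")
            log.flush()
            os.fsync(log.fileno())  # push to disk so the last line survives a reset
            time.sleep(period_s)
```

Running `log_forever()` in the background while the camera is paused should leave the final readings before the reboot at the end of `thermal.log`.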

Is this still an issue to support? Any result can be shared? Thanks

Yes, we are still working on the problem.

Our best solution so far is a hack: triggering the camera on a timeout if there is no outside trigger for 2 minutes. But that disrupts the synchronization. (There are frames that the client did not ask for, and the client cannot trigger until the extra frame is processed.)
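
Roughly, the hack looks like the sketch below. This is not our production code: `KeepAliveTrigger` and the `fire` callback are hypothetical names, and `fire` stands in for whatever actually triggers the sensor. The callback receives a flag saying whether the frame was requested by the client or generated by the timeout, so that keep-alive frames can at least be identified downstream.

```python
import threading


class KeepAliveTrigger:
    """Fire the camera ourselves if no external trigger arrives within timeout_s."""

    def __init__(self, fire, timeout_s=120.0):
        self._fire = fire            # hypothetical callable: fire(keepalive: bool)
        self._timeout_s = timeout_s
        self._lock = threading.Lock()
        self._timer = None
        self._generation = 0         # invalidates timers that were superseded

    def _arm(self):
        # Caller must hold the lock. Replace any pending timer with a fresh one.
        if self._timer is not None:
            self._timer.cancel()
        self._generation += 1
        self._timer = threading.Timer(
            self._timeout_s, self._on_timeout, args=(self._generation,)
        )
        self._timer.daemon = True
        self._timer.start()

    def start(self):
        with self._lock:
            self._arm()

    def external_trigger(self):
        """Call this for every trigger requested by the client."""
        with self._lock:
            self._fire(False)
            self._arm()              # restart the countdown

    def _on_timeout(self, generation):
        with self._lock:
            if generation != self._generation:
                return               # a newer trigger or stop() superseded this timer
            self._fire(True)         # keep-alive frame nobody asked for
            self._arm()

    def stop(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            self._generation += 1
```

The lock is held while `fire` runs, which mirrors the limitation described above: a client trigger that arrives during a keep-alive frame has to wait until that frame is done.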