Jetson XAVIER AGX watchdog reboot

Hello.
When we run an application we developed on JETSON XAVIER AGX, the board sometimes reboots for the following reason.

localhost kernel: [ 0.953601] tegra-pmc c360000.pmc: scratch reg offset dts data not present
localhost kernel: [ 0.953788] tegra-pmc: get_secure_pmc_setting: done secure_pmc=0
localhost kernel: [ 0.953810] tegra-pmc: ### PMC reset source: TEGRA_BCCPLEX_WATCHDOG
localhost kernel: [ 0.953818] tegra-pmc: ### PMC reset level: TEGRA_RESET_LEVEL_L1
localhost kernel: [ 0.953826] tegra-pmc: ### PMC reset status reg: 0x9
localhost kernel: [ 0.953922] tegra-pmc: PMC Prod config success
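
For anyone who wants to check many boots for this reset cause, the `tegra-pmc` lines above can be pulled out of a saved kernel log with a few lines of Python. This is only a sketch: the regular expression is based solely on the log lines quoted in this post, the function names are made up for illustration, and other L4T releases may format these lines differently.

```python
import re

# Matches lines such as:
#   tegra-pmc: ### PMC reset source: TEGRA_BCCPLEX_WATCHDOG
# The pattern is derived only from the log excerpt in this thread.
PMC_LINE = re.compile(r"tegra-pmc: ### PMC reset (source|level|status reg):\s*(\S+)")

SAMPLE = """\
localhost kernel: [ 0.953810] tegra-pmc: ### PMC reset source: TEGRA_BCCPLEX_WATCHDOG
localhost kernel: [ 0.953818] tegra-pmc: ### PMC reset level: TEGRA_RESET_LEVEL_L1
localhost kernel: [ 0.953826] tegra-pmc: ### PMC reset status reg: 0x9
"""


def parse_pmc_reset(log_text):
    """Return the PMC reset fields found in the log text as a dict."""
    info = {}
    for match in PMC_LINE.finditer(log_text):
        info[match.group(1)] = match.group(2)
    return info


def was_watchdog_reset(log_text):
    """True if the reported reset source name contains 'WATCHDOG'."""
    return "WATCHDOG" in parse_pmc_reset(log_text).get("source", "")


if __name__ == "__main__":
    print(parse_pmc_reset(SAMPLE))
    print("watchdog reset:", was_watchdog_reset(SAMPLE))
```

Feeding it the saved syslog of each boot makes it easy to count how often the previous reset was caused by the watchdog.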

<Questions>
In order to analyze and avoid this problem, please tell us the following.
  1. Who arms this watchdog timer, and when?
  2. Who kicks this watchdog timer, and when?
  3. Can it be affected by the application?
  4. Can this watchdog timer be temporarily disabled?
  5. If so, how?

<Operating conditions>
Base module: JETSON XAVIER AGX
Carrier board: custom-made
JETPACK Ver.: R32 (release), REVISION: 3.1, GCID: 18186506, BOARD: t186ref, EABI: aarch64, DATE: Tue Dec 10 07:03:07 UTC 2019
Application overview:
Recording
- Capture the SDI input video data from an FPGA connected over PCIe, using xdma
- Encode with nvv4l2h264enc (gstreamer) and save the file to an SSD connected over PCIe
Playback
- Decode the file data on the SSD with nvv4l2decoder (gstreamer)
- Transfer it with xdma to the FPGA connected over PCIe and output the video data over SDI

Thank you.

Hi,

The WDT is triggered when your CPU hangs. So what you should do is not “disabling the WDT”; you should find out why the board gets stuck. Disabling the WDT does not solve this problem.

Please capture the UART log and check whether anything is printed before the sudden reboot happens.
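
As a side note on inspecting the WDT rather than disabling it: the generic Linux watchdog framework can expose a device's state under `/sys/class/watchdog/`. The sketch below just reads those files. It assumes the kernel was built with `CONFIG_WATCHDOG_SYSFS` and that the Tegra watchdog shows up as `watchdog0`; neither is confirmed for this L4T release, so attributes that are missing are simply skipped. The function name is hypothetical.

```python
import os

# Attribute names from the generic Linux watchdog sysfs interface.
# Whether the Tegra driver on a given L4T release exposes them is an assumption.
ATTRIBUTES = ("identity", "state", "timeout", "timeleft", "nowayout", "bootstatus")


def read_watchdog_info(device_dir="/sys/class/watchdog/watchdog0"):
    """Read whichever watchdog attributes exist under device_dir."""
    info = {}
    for name in ATTRIBUTES:
        path = os.path.join(device_dir, name)
        try:
            with open(path) as f:
                info[name] = f.read().strip()
        except OSError:
            # Attribute not present or not readable on this system: skip it.
            continue
    return info


if __name__ == "__main__":
    info = read_watchdog_info()
    if not info:
        print("no watchdog attributes found (sysfs interface may be disabled)")
    for name, value in info.items():
        print(f"{name}: {value}")
```

If the attributes are present, `state` and `timeout` show whether the timer is armed and with what period, which is a starting point for working out how long a stall has to last before the reset fires.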

Thank you for your answer.
When this reboot occurred, neither the console log nor syslog contained anything other than the record that the reboot was caused by the watchdog.
Therefore, I would like to stop this watchdog timer and see how the system behaves when the faulty state occurs.
I would also like to know what kinds of things can prevent this watchdog timer from being serviced, and in particular whether the application can cause it.
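
Since nothing is left in the logs, one way to narrow down when the system stalls is a small heartbeat process that appends a timestamped line to a file and forces it to storage each time. After a watchdog reboot, the last line shows roughly when the system stopped making progress and what the load looked like. This is only a sketch: the file location, the one-second interval and the use of `/proc/loadavg` are illustrative choices, not anything specific to Jetson.

```python
import os
import time


def read_loadavg(path="/proc/loadavg"):
    """Return the contents of /proc/loadavg, or 'n/a' if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return "n/a"


def write_heartbeat(log_path, note=""):
    """Append one timestamped line and force it to stable storage."""
    line = f"{time.time():.3f} loadavg={read_loadavg()} {note}".rstrip() + "\n"
    with open(log_path, "a") as f:
        f.write(line)
        f.flush()
        # fsync so the line is not lost in the page cache on a hard reset.
        os.fsync(f.fileno())
    return line


def run(log_path, interval_s=1.0, count=None):
    """Write heartbeats forever, or `count` times if count is given."""
    n = 0
    while count is None or n < count:
        write_heartbeat(log_path, note=f"seq={n}")
        time.sleep(interval_s)
        n += 1


if __name__ == "__main__":
    # Hypothetical log location; pick a disk that survives the reboot.
    run("heartbeat.log", interval_s=1.0)
```

Running it alongside the recording or playback application and comparing the last heartbeat with the application's own logs may show whether the stall coincides with a particular phase of the pipeline.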

Thank you.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Which console are you using here? If you use a console other than UART, the log won’t be printed there.

Share the log with us and we can help confirm.
