Thor stress testing caused the device to restart

Test method: 7 instances of the following command were run:

/usr/src/tensorrt/bin/trtexec --loadEngine=./mega_convnet.engine --duration=3600
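
For anyone trying to reproduce this, the multi-instance launch could be scripted roughly as below. This is only a sketch: it assumes the instances are started in parallel from one shell, and `launch_instances` and the `./stress_logs` directory are hypothetical names, not part of TensorRT or JetPack. Each instance gets its own log file so that any output survives for later inspection.

```shell
# launch_instances COUNT LOG_DIR COMMAND [ARGS...]
# Hypothetical helper: starts COUNT copies of COMMAND in the background,
# writes each copy's output to LOG_DIR/instance_<n>.log, waits for all of
# them, and prints how many exited with a non-zero status.
launch_instances() {
    count="$1"
    log_dir="$2"
    shift 2
    mkdir -p "$log_dir"

    pids=""
    i=1
    while [ "$i" -le "$count" ]; do
        # Run one instance in the background with its own log file.
        "$@" >"$log_dir/instance_$i.log" 2>&1 &
        pids="$pids $!"
        i=$((i + 1))
    done

    failed=0
    for pid in $pids; do
        # wait returns the exit status of the given background job.
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}

# Example matching the test in this post (left commented out so that
# sourcing this file does not start the load by accident):
# launch_instances 7 ./stress_logs \
#     /usr/src/tensorrt/bin/trtexec --loadEngine=./mega_convnet.engine --duration=3600
```

Since the board resets during the test, the last lines of each log may never reach disk, so an incomplete log file here does not by itself indicate a software fault.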

Phenomenon: Within 5 minutes of starting the test, the entire Thor machine crashes. Sometimes it restarts on its own after the crash; other times it does not restart automatically and requires a power cycle.

Further testing found that even when running a single instance, the device restarts within 10 minutes, accompanied by an overcurrent alarm.

Which JetPack version are you using? Can you provide any logs for reference?

The JetPack version is JP7.0. The device only shows overcurrent alarms; there are no specific software logs. Do long-term overcurrent alarms have any other impact on the device?

Currently, this is the only message printed on the device.

Is it possible to move to JP7.1 and repeat the test?

I would also like to know whether long-term overcurrent alarms have any impact on the core board.

Generally, our system will throttle the board when an overcurrent event occurs.

The hardware should not be harmed, but it would be bad for your test case because the system would be throttling constantly.
