I have an object detection model I’ve trained with the TensorFlow Object Detection API. I converted the model using TF-TRT and am running inference on the Xavier devkit on images supplied from a RealSense D435. I’m using ROS to exchange data between nodes, and am visualizing the outputs in RViz. No other programs are running.
After booting the Xavier for the first time today, I ran the pipeline, and after about 30 minutes the Xavier shut off without warning. I then rebooted the Xavier and restarted the pipeline, and it now shuts off after a minute or two of running. The devkit does not shut down when simply idling.
I’m running the Xavier in MAX_N mode (via `nvpmodel`), and have used `jetson_clocks.sh` to set the fan to maximum and set the cores to their maximum clock speed. My original thought was that the device was overheating, so I inspected the output of `tegrastats` to check temperatures, but all reported temperatures were somewhere between 35 and 45C, which I believe is safe. The PMIC was reporting 100C, but I have never seen that number change (across multiple Xaviers), so I believe it is not reporting correctly. I also noticed that the Xavier is drawing approximately 39W from the wall (as measured by my meter) when running the aforementioned pipeline.
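For reference, here is a minimal sketch of how the temperature check described above could be automated rather than read by eye. It assumes `tegrastats` prints temperatures as `NAME@<value>C` tokens; the sample line is illustrative rather than a captured log, and the helper names `parse_temps` and `hottest` are hypothetical. The PMIC reading is ignored by default because it appears stuck at 100C.

```python
import re

# Matches temperature tokens such as "GPU@35C" or "Tdiode@37.5C".
# Tokens like "1%@1190" (CPU load @ frequency) do not match: "%" is not a
# word character and there is no trailing "C".
TEMP_RE = re.compile(r"(\w+)@(-?\d+(?:\.\d+)?)C\b")

# Illustrative line in the assumed tegrastats format (not a real capture).
SAMPLE = (
    "RAM 2669/15692MB CPU [1%@1190,0%@1190] AO@34C GPU@35C Tdiode@37.5C "
    "PMIC@100C AUX@34C CPU@35.5C thermal@34.7C Tboard@34C"
)


def parse_temps(line):
    """Return {sensor_name: degrees_C} for every NAME@<temp>C token in a line."""
    return {name: float(value) for name, value in TEMP_RE.findall(line)}


def hottest(temps, ignore=("PMIC",)):
    """Return (sensor, degrees_C) for the hottest sensor, skipping ignored ones."""
    candidates = {k: v for k, v in temps.items() if k not in ignore}
    return max(candidates.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    temps = parse_temps(SAMPLE)
    print(temps)
    print("hottest (ignoring PMIC):", hottest(temps))
```

Feeding each `tegrastats` line through `parse_temps` and appending the result to a file would preserve the last readings before a shutdown, which would help tell a thermal trip apart from a power problem.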
What are possible reasons for the shutdowns? Could it still be a thermal issue? Are there other issues I should be aware of?