Xavier suddenly cuts its own power

This has popped up a few times and I can’t seem to find the problem. It’s running fine, the fan is on, and it doesn’t feel hot, but when running something like Gazebo or TensorFlow the Xavier will suddenly shut down. It just turns itself off. I thought it was the power supply on my bot, but it does the same thing with the Xavier power supply plugged in as well. I monitored the internal power supply for any dips etc. and it’s steady as a rock, so I don’t think it’s a power thing. It doesn’t get overly hot. It seems to happen only with things that use the GPU. Is there a log file somewhere that will log why it shut down?

You could log in with a serial console and see what is going on right at the moment it stops. For example, you could run “dmesg --follow”, and the host with the serial console program will see everything that dmesg says without it disappearing.
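
If a second machine for the serial console isn’t handy, a rough alternative is to capture the same dmesg --follow output on the Jetson itself and force every line to disk as it arrives, so the last messages before the shutdown have a better chance of surviving. This is only a sketch: the function names and the log file name are made up for illustration, dmesg may need root on some setups, and a hard power cut can still lose the final write.

```python
import os
import subprocess


def log_lines(lines, path):
    """Append each line to `path`, forcing it to disk before reading the next."""
    count = 0
    with open(path, "a", encoding="utf-8") as out:
        for line in lines:
            out.write(line if line.endswith("\n") else line + "\n")
            out.flush()             # push Python's buffer to the OS
            os.fsync(out.fileno())  # ask the OS to commit the data to storage
            count += 1
    return count


def follow_dmesg(path="dmesg-follow.log"):
    """Run `dmesg --follow` and persist its output line by line (blocks until killed)."""
    proc = subprocess.Popen(
        ["dmesg", "--follow"], stdout=subprocess.PIPE, text=True, bufsize=1
    )
    try:
        return log_lines(proc.stdout, path)
    finally:
        proc.terminate()
```

Calling follow_dmesg() in a terminal before starting the GPU workload, then reading the log file after the next boot, should show whatever the kernel managed to say before the power went away. If the file ends with nothing unusual, that in itself hints at a power cut rather than a software-initiated shutdown.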

Further investigation showed it does work with the original PS, so I ordered another buck converter. Even though I can’t detect a voltage dip with the software for the buck, I can’t see it being anything else. That converter is 3 years old and has had some voltage surges pass through it in the past, so hopefully the new one will fix the problem. It’s tight in the bot, so I surmise I didn’t have the power plug fully inserted when it failed with the standard PS. The buck is 160 watts with a variable output voltage, settable from 12 to 24V. I run it at 19V. It’s smart, so I wrote a little program to print the input and output voltages continuously. I never see a drop, but that doesn’t mean it’s not dropping. I’ve run TK1, TX1 and TX2 on it and never had an issue before. Since I use a 6S battery I have to lower the voltage. I’ll hook up the console just for grins; nothing else to do until the other part gets here. I’m just a little worried that forcing it to keep shutting down might damage the Xavier.
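
To put a rough number on why a polling program can miss a dip: if the software takes an instantaneous reading every so often and a single dip lasts much less than the polling interval, the chance that any reading lands inside the dip is roughly the dip duration divided by the polling period. The sketch below assumes instantaneous samples and a dip that starts at a random time; the function name and the example numbers are hypothetical, not measurements from this setup.

```python
def chance_of_catching_dip(dip_s, poll_period_s):
    """Probability that a periodic sample lands inside one dip of length dip_s.

    Assumes instantaneous samples and a dip that begins at a uniformly
    random point in the polling cycle.
    """
    if dip_s <= 0 or poll_period_s <= 0:
        raise ValueError("durations must be positive")
    return min(1.0, dip_s / poll_period_s)
```

For example, chance_of_catching_dip(0.001, 0.1) models a 1 ms dip polled ten times per second and gives about 0.01, i.e. roughly a 1% chance of seeing it. A scope triggered on the falling edge is a much more reliable way to catch something that short.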

Nope, connecting up the console doesn’t show anything.

Right at the moment of boot (and I mean a very short moment) rails must meet a certain tolerance (sorry, I don’t know what tolerance over what time period, but it is probably far smaller than a ms), and if that doesn’t occur, boot will not start. In the past I’ve seen people put as much as 8000uF on the connector (close to the Jetson) of a supply where no drop can be seen on a scope and get it to work. I realize you won’t have room for large capacitors, but if you have the ability to at least test with this it would be good information to know that it worked or didn’t work.

One workaround I’ve seen people succeed with is a buck/boost converter for one more level of isolation. If you can verify whether extraordinary measures to keep the supply from drooping during that very brief start current do or do not work, then you’ll at least have more confidence about the source of the failure.

I have also seen barrel connectors which seem to work, but they are not exact matches, and the system won’t boot off of it…but a perfect connector does work on those same systems. It sounds like this connector has worked for you in the past, so I doubt that is the problem, but it is worth mentioning.

http://www.mini-box.com/DCDC-USB-200 This is what I’m using, a buck-boost for car computers. 166 watts should be more than enough. I found another thread where this is happening when running darknet. It looks like the Xavier uses much more than 30 watts at nvpmodel -m 0. Several people are having the issue, so it doesn’t look good for fixing it with a new converter. The regular robot stack works fine; it’s just when I try to do object ID with the ZED and the Stereolabs SDK for TensorFlow. Gazebo also sometimes causes it to shut down. The Xavier is set up to boot on power application, and it doesn’t reboot even though power is still there; I have to power-cycle the whole thing to get it to come up again.

I see your particular supply has programmable output to certain voltages. 18V would be the best bet…what is it currently running at?

Are there any USB or other peripherals consuming power from the Jetson at the moment of power on? I’m guessing the ZED draws a lot, it is USB3 and demands the port to itself. Just for testing, does it boot differently when the ZED (and any other significant attached device) is removed?

Tried out a new buck converter today. No joy. It won’t come up without the ZED plugged in, so I have to have that. I’m running it at 18V. I was running darknet without issues before; now that won’t work either. Starting to suspect a hardware issue.

The converter I’m using is rated at 160 watts; running off a 6S 10K 25C LiPo, it should handle it easily. Yeah, not a happy camper. You build a power system according to the info you are given, then find out you might as well junk it: not only is it underpowered, but even if it wasn’t shutting down, my runtime calculations are now close to half what I wanted. The converter is made for car computers and has run just fine on TX1 and TX2. Still no Isaac SDK after 6 months. As far as I’m concerned they are straight up lying. I don’t know of any rovers running on AC power with the power cube, so what are the specs needed to make this thing work in a battery-powered system? Wasted a lot of money here.

They may have specified 30W in the marketing material as there are power modes which limit the draw to 30W (or 15 or 10 if desired). That said, I do agree that it would’ve been nice to know the maximum power draw possible under MAX_N mode.
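
As a rough illustration of how much the chosen power mode matters for battery runtime, here is a back-of-the-envelope estimate. It reads the "6S 10K" pack mentioned earlier as 6 cells at 10 Ah, and it assumes 3.7V nominal per cell, 80% usable capacity and 90% converter efficiency; those last three figures are assumptions, and the estimate covers the Xavier alone, ignoring motors and everything else on the battery.

```python
def runtime_hours(cells, capacity_ah, load_w,
                  nominal_v_per_cell=3.7,     # typical LiPo nominal voltage
                  usable_fraction=0.8,        # assumed: don't drain the pack fully
                  converter_efficiency=0.9):  # assumed DC-DC efficiency
    """Estimate hours of runtime for a constant load on a LiPo pack."""
    if load_w <= 0:
        raise ValueError("load must be positive")
    pack_wh = cells * nominal_v_per_cell * capacity_ah
    usable_wh = pack_wh * usable_fraction * converter_efficiency
    return usable_wh / load_w
```

With these assumptions, runtime_hours(6, 10, 30) comes out a little over 5 hours, and doubling the load to 60W halves it. That matches the complaint above: a module that draws about twice the planned power gives about half the planned runtime.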

@danpollock For our system we decided to use a 200W regulator to power the Xavier and a few other light devices, which is more than 3x the rated power draw. Do you have anything else running on the converter, or anything else running on the battery? We’ve noticed some regulators will brown out if the input voltage dips too low, which could happen when running e.g. motors too hard.

What battery configuration are you using that works? I’m kinda stuck with what I have. Retired and budget is already blown :)

As it has been with the previous Jetsons, MAX-N mode is intended for engineering development and not deployment, unless that is something you desire and your application can handle the TDP. For deployment we ship the 10/15/30W modes. You will get the best efficiency out of 15W mode, which is the default mode enabled after flashing JetPack-L4T. You can see the efficiency and power consumption measurements for inference in the DL benchmarks here:

https://developer.nvidia.com/embedded/jetson-agx-xavier-dl-inference-benchmarks
https://devblogs.nvidia.com/nvidia-jetson-agx-xavier-32-teraops-ai-robotics/

BTW, Xavier should consume 1.6W while idle in 15W mode, and around 3W while idle in MAX-N mode. But if you have run the ~/jetson_clocks.sh script, this will lock the clocks to maximum regardless of processing load, disabling the clock governors and DVFS.

Like the inferencing benchmarks, this was measured using the onboard INA’s. Are you using an external wall power monitor like a Kill-A-Watt?

If you don’t want to reboot for MODE_10W, add this to your /etc/nvpmodel.conf:

< POWER_MODEL ID=7 NAME=MODE_10W_NOREBOOT >
CPU_ONLINE CORE_0 1
CPU_ONLINE CORE_1 1
CPU_ONLINE CORE_2 0
CPU_ONLINE CORE_3 0
CPU_ONLINE CORE_4 0
CPU_ONLINE CORE_5 0
CPU_ONLINE CORE_6 0
CPU_ONLINE CORE_7 0
TPC_POWER_GATING TPC_PG_MASK 0
GPU_POWER_CONTROL_ENABLE GPU_PWR_CNTL_EN on
CPU_DENVER_0 MIN_FREQ 1200000
CPU_DENVER_0 MAX_FREQ 1200000
GPU MIN_FREQ 318750000
GPU MAX_FREQ 520000000
GPU_POWER_CONTROL_DISABLE GPU_PWR_CNTL_DIS auto
EMC MAX_FREQ 1065600000
DLA_CORE MAX_FREQ 550000000
DLA_FALCON MAX_FREQ 330000000
PVA_VPS MAX_FREQ 115200000
PVA_CORE MAX_FREQ 115200000
CVNAS MAX_FREQ 601600000

Then use nvpmodel -m 7; it will not require you to reboot. The only change from the original MODE_10W profile was setting TPC_PG_MASK to 0, which is the same as the other modes. Changing this value requires a reboot because it’s a boot-time hardware configuration setting.
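
Before appending the block, it may be worth checking that ID 7 is not already taken in your copy of the file. A minimal sketch that lists the POWER_MODEL headers is below; it assumes every header follows the same "< POWER_MODEL ID=n NAME=x >" shape as the one above, and the function name is made up for illustration.

```python
import re

# Matches headers shaped like: < POWER_MODEL ID=7 NAME=MODE_10W_NOREBOOT >
HEADER = re.compile(r"<\s*POWER_MODEL\s+ID=(\d+)\s+NAME=(\S+)\s*>")


def list_power_models(conf_text):
    """Return {id: name} for every POWER_MODEL header found in conf_text."""
    models = {}
    for match in HEADER.finditer(conf_text):
        model_id = int(match.group(1))
        if model_id in models:
            raise ValueError(f"duplicate POWER_MODEL ID {model_id}")
        models[model_id] = match.group(2)
    return models
```

Something like list_power_models(open("/etc/nvpmodel.conf").read()) returns the IDs and names currently defined, and raises if the same ID appears twice, which is an easy mistake to make when pasting a new profile.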

If you want 30W and all 8 CPU cores, then we recommend trying mode #3 (MODE_30W_ALL).

Here is a table listing the different modes and their configurations of the cores and clock frequencies:
https://docs.nvidia.com/jetson/l4t/Tegra%20Linux%20Driver%20Package%20Development%20Guide/power_management_jetson_xavier.html#wwpID0E0ML0HA

The Jetson power controller is very fidgety with even small power swings at the input barrel jack.
Even if you see “no drop” at the power supply, and there’s a cable between the power supply and the barrel jack, you may still be in trouble.

Cheap DC-DC converters only run at a few hundred kilohertz switching frequency, and they typically add damping to their control loops on the order of 5 cycles to avoid self-oscillating, so it’s entirely possible for the Jetson power controller to see a voltage dip when the load surges, one that you won’t see at the power supply itself.

Similarly, I’ve seen motor spikes on the power bus also confuse the Jetson power controller and have it shut down (even though the top voltage of the spikes was still below 20V.)

The solution is to add filtering right at power input. A fat capacitor across the barrel jack, and perhaps an inductor to avoid motor spikes if you share a bus between motor and computer. Then, you may run into trouble, because some DC/DC converters don’t like opening up into a high capacitive load, and go into self-protect on power on.
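
For a feel of how big "fat" needs to be, here is an idealized estimate. It assumes the capacitor alone carries a sudden load step until the converter's control loop catches up, takes the loop response time as roughly the damping cycles divided by the switching frequency, and sizes the capacitor from C = I * t / dV. It ignores capacitor ESR and cable inductance, and the function names and example numbers are hypothetical, so treat the result as a lower bound rather than a design value.

```python
def converter_response_s(switching_hz, damping_cycles=5):
    """Rough control-loop response time: a handful of switching cycles."""
    if switching_hz <= 0:
        raise ValueError("switching frequency must be positive")
    return damping_cycles / switching_hz


def min_capacitance_f(load_step_a, hold_time_s, max_droop_v):
    """Capacitance needed to supply load_step_a for hold_time_s
    while the voltage sags by no more than max_droop_v (C = I * t / dV)."""
    if max_droop_v <= 0:
        raise ValueError("allowed droop must be positive")
    return load_step_a * hold_time_s / max_droop_v
```

As an example with made-up numbers: a 300 kHz converter with 5 cycles of damping gives converter_response_s(300e3) of about 17 microseconds, and min_capacitance_f(3.0, converter_response_s(300e3), 0.1) for a 3A step with 0.1V of allowed droop comes out around 500 uF. Real supplies and cables behave worse than this ideal, so going well above the estimate is reasonable if the converter tolerates the capacitive load.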

What I ended up doing was put a rail pretty much straight from battery to Jetson (4S LiPo battery) and a separate rail from battery to motor, to reduce the risk of spikes or ground loop induced voltage fluctuations.

You can get a 2 Ah 4S radio controlled car battery and tie it straight to the barrel input for the Jetson, if you can arrange for the Jetson to get 15.6 Volts in. “Floating” LiPo batteries at their top charge state is unhealthy for the battery; it builds up the oxide layer faster and will lead to thermal failure of the battery (either it stops working, or it stops working while generating smoke and heat.)

So, this would probably work better:

Main power -> DC DC converter at 15.6V out -> 4s (“14.8V”) LiPo battery -> Jetson.

Something like this would work:
https://amzn.to/2MUvTs2

Of course, the battery will keep powering the Jetson after the DC converter is off, and LiPo batteries don’t like being drawn down too far (again, oxide layer destroying itself) so remember to unhook the battery when you’re done. (I don’t; instead I built a special battery control/management module with a bunch of power MOSFETs and a small microcontroller with ADC for voltage rail monitoring. But remembering to unhook is cheaper, as long as you don’t forget :)
Never attempt to force-charge a dead LiPo battery. If it’s below 3V per cell (12.0V for 4S) it should be considered dead and carrying the plague, and disposed of to battery recycling.
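
The thresholds above are easy to encode if you already have a pack voltage reading from an ADC or a meter. This is only a sketch: the function name is made up, the 3.0V per cell cutoff is the one stated above, 4.2V per cell is the usual LiPo full-charge voltage, and it looks only at the average cell voltage, so it cannot see a single weak cell in an unbalanced pack.

```python
DEAD_V_PER_CELL = 3.0  # from the rule above: below this, treat the pack as dead
FULL_V_PER_CELL = 4.2  # usual full-charge voltage for a LiPo cell


def pack_status(pack_v, cells=4):
    """Classify a LiPo pack from its total voltage (average per cell only)."""
    per_cell = pack_v / cells
    if per_cell < DEAD_V_PER_CELL:
        return "dead: recycle, do not charge"
    if per_cell > FULL_V_PER_CELL:
        return "over-voltage"
    return "ok"
```

For the 4S pack discussed here, pack_status(15.6) is "ok" (3.9V per cell), while pack_status(11.9) falls below the 12.0V line and is reported as dead.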

You’d probably also want:
https://amzn.to/2UJYOld
https://amzn.to/2GrhkuI

If all that still doesn’t work, chances are there’s a driver/kernel bug, or there’s something wrong with your hardware module. But then it should also happen with the original power supply.