Various problems with TX1 Developer board

Hello,

I have had access to a TX1 Developer Kit for about 6 months now, but today I noticed a few issues when trying to get it to work that I did not encounter before:

  • Pressing the power button does not start the boot process right away; it takes 45 seconds to 1 minute before booting actually starts. Contrary to what I have read, I don't think it's a networking issue, since this happens whether or not the Jetson is connected to a USB device or to Ethernet. Rebooting does not have this delay.
  • When booting up after plugging power, the two green LEDs by the buttons do not light up.
  • The USB 3.0 port does not detect devices or supply current to anything attached so I can't use a mouse or keyboard with the kit. An SD card cannot be detected either.

The last time I used the developer kit was 3 months ago (it was in a high school during summer break) and that was to flash JetPack 3.2. I was able to flash JetPack 3.3 today but in order to do that I had to access the serial console to reboot the TX1 to recovery mode, and the issues did not go away.

I’m not sure what the problem is between the module and the development board. It’s possible the board got damaged but the kit wasn’t accessible in the 3 month period so that sounds doubtful. I am using the stock AC adapter that came with the kit. I can provide a dmesg log later today if that helps. Otherwise, should I consider an RMA?

Here’s what lsusb outputs:

Bus 002 Device 002: ID 0955:09ff NVidia Corp.
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
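
For anyone comparing against a working kit, a listing like the one above can be filtered so that only non-root-hub entries remain. This is just a rough sketch: the function name `non_root_hub_devices` is made up for illustration, and it assumes root hubs are reported under the Linux Foundation vendor ID 1d6b, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any NVIDIA tool): read lsusb-style
# text on stdin and print only the lines that are not root hubs.
# Assumption: root hubs carry the Linux Foundation vendor ID 1d6b.
non_root_hub_devices() {
    # grep exits non-zero when nothing is left; treat that as success.
    grep -v 'ID 1d6b:' || true
}

# Example use on a live system:
#   lsusb | non_root_hub_devices
```

If a keyboard or mouse were being enumerated, it would show up as an extra line in this filtered output; with the listing above, only the NVidia Corp. entry remains.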

And here’s what dmesg outputs: I notice some errors with “tegra21x_xusb_firmware”. But if the root hubs can be found then it sounds more likely the board has malfunctioned, and using a different carrier board may work?

EDIT: I uploaded the file so it doesn’t get truncated as it probably did previously
dmesg.log (65.6 KB)

Depending on how networking is set up, it can be that instead of continuing on when network is not up it waits for a timeout. So I wouldn’t ignore the possibility of network issues if you don’t see what the serial console says during boot even if the software setup is 100% correct (the DHCP server or router might not answer and then a timeout might occur). However, you are right that it could be something else…often a different timeout which wasn’t timing out previously.

Your logs show a boot time after the boot loader of only 8 seconds. I assume that at the point it truncates this is where things go wrong? Can you verify?

FYI, a serial console can show some output a regular console misses. A serial console does not depend on most of the drivers a video console needs, so it keeps working when those drivers fail. In your case it might not change anything, but I still recommend making sure the log didn’t miss anything via a serial console (though I don’t know if perhaps this is already logged from a serial console). See:
http://www.jetsonhacks.com/2015/12/01/serial-console-nvidia-jetson-tx1

Assuming the delay starts where this log truncates it is quite possible something connected to USB is involved. What is connected to the USB? Can you see what happens to boot time if nothing is connected to USB? Has anything on USB changed since it was last known to work correctly?

FYI, the internal network controller on a TX1 is wired through a USB HUB. So USB and wired ethernet can affect each other.

Oops, when I tried to paste the entire log to my post, it seems to have been truncated at some point, probably because I hit the post size limit. I uploaded it as a file now.

To clarify, the issue I’m having is that it takes around 45 seconds from the time I press the power button to when I see the boot logo on my monitor (and the green LEDs don’t light up, which they did when I first used it). Otherwise there are no problems AFAIK other than the bullet points; the TX1 boots successfully, and that takes 20 seconds, which I expect. I did see some [FAILED] outputs while it was booting, so I should record a video to figure out what they could be.

I did figure out how to access the serial console, but the connection drops every time data is received while the TX1 is booting, due to a framing error as reported by CoolTerm on Windows. When the GUI shows up, it doesn’t disconnect unless I try to run a command that outputs a lot of data, which could be what is happening during boot. I’m using an Adafruit FTDI Friend to connect to the console, and I connect to that with 115200 baud configured and no hardware flow control.

FYI, the internal network controller on a TX1 is wired through a USB HUB. So USB and wired ethernet can affect each other.

That’s interesting, but I’m not sure how that affects me. The USB 3.0 port doesn’t currently provide current whether or not the Ethernet cable is plugged in. The Ethernet port lights don’t light up, which is strange, but otherwise I haven’t had networking issues yet. I couldn’t find any correlation between the long delay before booting and whether something was connected to the port or not. When the port worked, I would connect it to a USB 3.0 hub.

EDIT: I got the serial console to work in Ubuntu 16.04 with minicom. I opened the console and then booted the Jetson, but I only got garbage when I was about to start booting. I don’t think I got anything different from what I could get by saving the dmesg output to a file.

The regular log (via dmesg) only begins once the kernel has loaded. To see what occurs earlier (during boot loader stages) you’d have to have a serial console (URL given above if you want to try serial console). I’m not sure if anything would show up from early CBoot stage via serial console, but there is no way from the current log to know about those earlier moments in boot.

Just for information, your log stops at about 15 seconds (timestamp, not actual time…this is the time since kernel load), and I have an R28.2 TX1 log which does a lot in a similar way (but not exactly) up through that time. Actual logging and time to reach a login prompt for me continues on after a few seconds of pause (yours stops at that point)…mine continues there with “xhci-tegra 70090000.xusb” messages. The only thing I see from this which might be a clue is this:

[    5.274326] xhci-tegra 70090000.xusb: can't get usb3-0 phy (-517)

The PHY error could be a hardware error at the USB connector, but I can’t say since mine had USB connected and perhaps yours does not (or perhaps a different USB speed on that connector would change the message…I don’t know).
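
One note on that line: -517 is the kernel's EPROBE_DEFER code, which means the driver asked for its probe to be retried later because something it depends on was not ready yet. On its own it does not prove a hardware fault. A minimal sketch for pulling such lines out of a saved log follows; the function name is hypothetical, and the file name `out` in the usage comment is only an example.

```shell
#!/bin/sh
# Hypothetical helper: print kernel log lines that report error -517
# (EPROBE_DEFER, i.e. a deferred driver probe).
# Reads the files given as arguments, or stdin when none are given.
deferred_probes() {
    # -F matches the text literally; "|| true" keeps an empty result
    # from being treated as a failure.
    grep -F -- '(-517)' "$@" || true
}

# Example use on a saved log:
#   dmesg > out
#   deferred_probes out
```

If the same device shows a deferral early on and a successful probe later in the log, the deferral was probably harmless; a deferral that is never followed by a successful probe is more suspicious.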

On my R28.2 TX1 the logo shows up at about 10 seconds. 45 seconds does seem to be an oddly long delay. If you want to see what the difference is you’ll need to use serial console and log that. FYI, serial console has a time stamp during bootloader stage which could be compared with a known working system (the timer restarts upon kernel load, but a dmesg log won’t show anything prior to kernel load). It might be there is something wrong, or it might be working correctly.

Okay, I watched the video in the Jetson Hacks article and did notice that the CBoot log was appearing. However, when I booted my Jetson kit, I couldn’t replicate the result. Here’s a screenshot (see upload) of what happens when I reboot the Jetson via the command line: I get one line from the CBoot stage and a bit of garbage data (then the kernel log appears for a fraction of a second, more garbage, then nothing until 9 seconds). I don’t think there was any output to the console during the 45 second delay; I would know because a green light on the Adafruit FTDI Friend blinks when data arrives.

I obtained the dmesg logs by running “dmesg > out” after the GUI appears, and unless I wait for about 5 minutes it does stop after about 15 seconds. I didn’t know there were more messages after the IPVS line. Also, the dmesg output was the same whether or not I had a USB device plugged in (the device wasn’t detected). Is it possible there are issues with both the board and the TX1 module?
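
To check whether a saved capture really stops at around 15 seconds, the last kernel timestamp in the file can be printed and compared between captures taken at different times. This is only a sketch: the function name is made up, and it assumes the usual "[ seconds.microseconds] message" layout seen in the log line quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the last kernel timestamp (seconds since
# kernel load) found in dmesg-style text read from stdin.
# Assumption: lines look like "[    5.274326] message".
last_kernel_timestamp() {
    # Keep only the number inside the leading brackets, then take the
    # final one seen.
    sed -n 's/^\[ *\([0-9][0-9]*\.[0-9][0-9]*\)].*/\1/p' | tail -n 1
}

# Example use:
#   dmesg > out
#   last_kernel_timestamp < out
```

Running it once right after the GUI appears and again a few minutes later shows whether the later messages were simply not logged yet at the time of the first capture.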

That was from shutdown. Do you have a serial console log from boot start? “dmesg” will never be able to show early boot information…if it is shutdown you are looking at, then it might be ok, CBoot is entirely during early start-up (even before U-Boot).

Btw, there are some differences in boot between the different releases. I don’t know which L4T was used in the video you’re looking at, but it is possible it is from some other release.

The green LEDs should go on instantly when the power button is pressed (assuming it has power available). The monitor’s behavior though should be quite delayed compared to that logo (45 seconds to a minute to see the monitor turn on is expected). Are the LEDs delayed in turning on? If so, then this is probably a hardware failure.

Here’s a screenshot of the console when the Jetson boots up. I just get garbage data and part of the dmesg output, but nothing that seems to be from CBoot or U-Boot.

Also, the Green LEDs don’t light up at all. (The top green LED lights up if I connect the TX pin with TX on my serial device and RX with RX but then serial won’t work of course and I doubt that’s intentional).

It isn’t possible for dmesg to show anything from CBoot or U-Boot…those processes are long dead by the time the Linux kernel loads. There can be some odd bytes right at the very start of serial console, but this isn’t necessarily a bug.

One thing to consider is that if you don’t know for certain that minicom is set up correctly, then there might be stray bytes sent which are for a modem init string (such is the history of minicom, it was designed for modems). I’ll suggest you install gtkterm and run it like this (I am assuming you know your host’s serial console is “/dev/ttyUSB0”…adjust for your case…“sudo” is required if your user is not a member of the dialout group):

gtkterm -p /dev/ttyUSB0 -s 115200 -b 8 -t 1

(115200 8N1 without flow control)
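
If gtkterm is not available, roughly the same 115200 8N1, no-flow-control setup can be sketched with stty and tee. This is an untested alternative, not something verified in this thread: the function names and the default log file name are made up, and it assumes a Linux host with GNU stty (which uses "-F" to name the device).

```shell
#!/bin/sh
# Hypothetical helper: print stty settings for 8 data bits, no parity,
# 1 stop bit, no hardware flow control, raw mode without local echo.
serial_stty_args() {
    # $1: baud rate (defaults to 115200)
    printf '%s cs8 -parenb -cstopb -crtscts raw -echo\n' "${1:-115200}"
}

# Hypothetical helper: configure the port, then show everything received
# on the terminal while also saving it to a log file.
serial_capture() {
    port="${1:-/dev/ttyUSB0}"      # assumed device name, adjust as needed
    log="${2:-serial_boot.log}"    # made-up default log file name
    # Word splitting of the settings string is intended here.
    stty -F "$port" $(serial_stty_args 115200) || return 1
    tee "$log" < "$port"
}

# Example use (start this first, then power on the Jetson; stop with Ctrl-C):
#   serial_capture /dev/ttyUSB0 boot.log
```

Because the capture goes straight to a file, the result can be attached to a post without relying on screenshots.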

The lack of the LED tends to be fairly strong evidence something went bad on the carrier board, but you should probably check the gtkterm serial console before drawing that conclusion. I do not believe your software could cause the LED to stop showing, nor for it to delay as long as you’ve noticed. The green LEDs should go on almost instantly and not have any noticeable delay.

Yeah, I agree the green LEDs should be connected to some electrical signal and probably aren’t turned on by any software. I haven’t done anything to the software on the Jetson other than reflashing L4T 28.2 and updating packages.

I used gtkterm and there’s a notable difference in the output. The first photo shows what I get a few seconds before the boot logo comes up on a monitor: a lot of bytes are received, but they are still not discernible. The second photo shows my setup, and you can see the green LEDs aren’t on.

Thanks for your help by the way!

Gtkterm allows you to save to a log file. You might use “File->clear screen”, “Log->to file”, and only then begin the boot process. The file selection mechanism is a bit of a pain at times so I sometimes “touch <my_preferred_file_name.txt>” and then log to the touched file. You’ll be able to capture 100% of the output without effort.

There are a few small places where serial console might see non-printing characters, but what you’re seeing is entirely too much. I believe you’ll need to RMA. Between the LED issue and this non-printable random text I don’t see this as fixable through software. You could try to flash again, but I don’t think it is going to help.

If you search for RMA near the top of this it’ll list instructions:
https://devtalk.nvidia.com/default/topic/793798/embedded-systems/some-jetson-web-links/

Keep in mind that if you get a new board you might find the actual module is still good…indications are that it is probably the carrier board which is failing.

I accepted your answer right away but it took me a few days to borrow another developer board from another school’s robotics team. We get them for free through FIRST Choice and I think because of that we can’t refund the Jetson kits as those are donated.

The good news is I can confirm the module is indeed working with the other board (which has none of the issues I have brought up), which confirms that only the developer board has a hardware failure. Well that’s great, we only have to spend money on a new carrier board - and we need one anyway that is small and takes up minimal surface area on a robot.

Donated boards do probably still have a warranty period, but I don’t know…perhaps it would need to be replaced by the person doing the donating. You would have to ask whoever donated it.

Carrier boards can be expensive, but the thing which usually causes confusion is that you’ll need to be sure to get the board support package from that carrier vendor. Mostly this will install a different device tree (wiring layout will differ on the alternate carrier board). If you were to just move the module from a dev carrier to a new carrier it wouldn’t work until the new BSP is installed.