Jetson TX2 recovery error after updating u-boot: probing the target board failed

Hi,
I just tried to update U-Boot by installing a u-boot-dtb.bin to bootloader/t186ref/p2771-0000/500/u-boot.bin, but after running sudo ./flash.sh -r -k kernel jetson-tx2 mmcblk0p1, U-Boot fails with:

[0003.169] Synchronous Exception: UNKNOWN EXCEPTION

This requires debugging but now flash.sh can’t seem to access the devkit in recovery mode:

$ sudo ./flash.sh  jetson-tx2 mmcblk0p1
###############################################################################
# L4T BSP Information:
# R28 (release), REVISION: 2.0, GCID: 10567845, BOARD: t186ref, EABI: aarch64, 
# DATE: Fri Mar  2 04:57:01 UTC 2018
###############################################################################
Error: probing the target board failed.
       Make sure the target board is connected through 
       micro-B USB port and is in recovery mode.

I can see the devkit in recovery mode using lsusb and in dmesg:

$ lsusb | grep NV
Bus 001 Device 039: ID 0955:7c18 NVidia Corp.

$ dmesg
...
[92184.865059] usb 1-1.3.2: new high-speed USB device number 39 using xhci_hcd
[92184.977936] usb 1-1.3.2: New USB device found, idVendor=0955, idProduct=7c18
[92184.977941] usb 1-1.3.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[92184.977943] usb 1-1.3.2: Product: APX
[92184.977946] usb 1-1.3.2: Manufacturer: NVIDIA Corp.

How can I recover my tx2?

I’m not running the flash.sh tool from a VM and I’ve tried the tx2 module in a different devkit.
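
For anyone repeating the lsusb check above in a script, here is a minimal sketch. It assumes the TX2 enumerates as 0955:7c18 in recovery mode, as in the lsusb output in this post; `find_recovery_device` is a made-up helper name, not part of L4T.

```shell
#!/bin/sh
# Hypothetical helper: reads `lsusb` output on stdin and reports the bus and
# device number of a Jetson in recovery mode (assumed ID 0955:7c18, taken
# from the lsusb output quoted in this thread).
find_recovery_device() {
    awk '$6 == "0955:7c18" {
             sub(":", "", $4)              # "039:" -> "039"
             print "bus=" $2 " device=" $4
             found = 1
         }
         END { exit !found }'              # non-zero status if nothing matched
}

# Example:
#   lsusb | find_recovery_device || echo "no Jetson in recovery mode found"
```

This only confirms that the device enumerates; as the rest of the thread shows, enumeration alone does not mean flash.sh can talk to it.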

Which L4T release is it (R28.2.1 is most recent)? Is there a USB HUB being used between host and Jetson? Is the USB cable the one which comes with the dev kit?

I’ve downloaded a fresh Linux_for_Tegra r28.2 and tried to flash using that but it still can’t find the devkit. I don’t think I have the original cable but the cables I’m using have data lines - I can see the device appear in dmesg and lsusb.

I wonder why the flash.sh tool can’t see it. Maybe the devkit device has appeared in an unexpected place…

So the flash.sh tool uses bootloader/tegrarcm_v2 to read the UID of the device. On my laptop I am unable to read the UID:

$ sudo ./bootloader/tegrarcm_v2 --uid
Failed to read UID

However, the same command works on another person’s laptop:

$ sudo ./bootloader/tegrarcm_v2 --uid
BR_CID: 0x81801001644037c91c00000007020200

However, I am also unable to flash on that laptop - I get the same error as the OP.
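
Since the UID read seems to be the step that fails, it can be retried a few times on its own before running a full flash. The sketch below is only an illustration: `read_uid_with_retry` is a hypothetical helper, and it treats a `BR_CID:` line in the output (as in the successful run above) as success. Whether the board tolerates repeated UID reads without being put back into recovery mode is an assumption I have not verified.

```shell
#!/bin/sh
# Hypothetical helper: run a UID-read command up to <attempts> times and
# succeed as soon as its output contains "BR_CID:".
# usage: read_uid_with_retry <attempts> <delay_seconds> <command...>
read_uid_with_retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        out=$("$@" 2>&1)
        case $out in
            *BR_CID:*)
                printf '%s\n' "$out"
                return 0
                ;;
        esac
        echo "attempt $i: no BR_CID in output" >&2
        i=$((i + 1))
        # Pause between attempts, but not after the last one.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (run as root from the Linux_for_Tegra directory):
#   read_uid_with_retry 5 2 ./bootloader/tegrarcm_v2 --uid
```

If the read only succeeds intermittently, that points more toward a signal or host-side problem than toward a bricked module.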

R28.2 may have issues (don’t know if repository.json was fixed yet or not), but R28.2.1 would be good.

There are many cables out in the wild which seem to have data lines, but then fail for any serious work. Even if you see existence with “lsusb” chances are rather high that your cable is a problem for actual flash. You may want to invest in another cable known to work well with data and not sold as a charger cable. Be suspicious of any cable not specifically listed as “micro-B” USB…“OTG” in theory would be the same, but in reality isn’t always of sufficient quality (too many people equate “charger” and “OTG” without regard to quality of the data side).

Sometimes if a signal quality issue is present, then removing or adding a USB HUB might help (avoid other devices connected to the HUB if possible).

You could try to clone a partition which simplifies things because it only reads, but the failure to read UID would also cause clone to fail.

Okay thanks linuxdev, I was confusing things, the other laptop was running R28.1. My local R28.1 also returns the correct BR_CID.

I will try some other cables (although this very cable is what flashed the faulty u-boot in the first place).

Edit: found the original USB cable - still seeing the same issue.

There may be an actual hardware problem in this case, though it might be on your laptop and not the Jetson. At this point enough of the possibilities have been narrowed down that the best evidence is from the UID read working with someone else’s laptop, and yet not with yours. Have you tried with and without a USB HUB? Sometimes a HUB can offer better signal quality.

Okay so I’ve got a new USB data cable with a good connection to the board (a solid click). I am able to successfully flash the device using the flash.sh tool included in JetPack 3.2 on Ubuntu 12.04 running on a Thinkpad X220. In addition to this I am able to flash on a Thinkpad X230 using Tegra186_Linux_R28.2.1_aarch64.tbz2 on Debian Stretch 9.4 (current stable).

Using the same cable, and the flash.sh tool included in Tegra186_Linux_R28.2.1_aarch64.tbz2, on Debian Stretch 9.4 running on my Thinkpad X230 I have little success in flashing. When I run sudo ./flash.sh jetson-tx2 mmcblk0p1, one of the following three things occurs:

  1. Most of the time it reports “Error: probing the target board failed.”
  2. Once, I succeeded in flashing with the -r flag: sudo ./flash.sh -r jetson-tx2 mmcblk0p1
  3. Sometimes, flash.sh detects the board and begins flashing, but then hangs at the following point:
$ sudo ./flash.sh jetson-tx2 mmcblk0p1
...
[   0.4510 ] Generating recovery mb1-bct
...
[   0.5334 ] Boot Rom communication
[   0.5345 ] tegrarcm_v2 --chip 0x18 --rcm rcm_list_signed.xml
[   0.5354 ] BootRom is not running
[   1.8307 ] 
[   1.8330 ] tegrarcm_v2 --isapplet

When this happens, I periodically see the following error in dmesg:

[ 8822.096660] INFO: task tegrarcm_v2:12576 blocked for more than 120 seconds.
[ 8822.096669]       Tainted: G           O    4.9.0-6-amd64 #1 Debian 4.9.88-1+deb9u1
[ 8822.096672] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8822.096676] tegrarcm_v2     D    0 12576  12513 0x20020000
[ 8822.096683]  ffff9c13b464f000 0000000000000000 ffff9c136db28140 ffff9c151e2d8980
[ 8822.096688]  ffff9c1514f87080 ffffa91ac808bc50 ffffffff98c0d6d9 ffffa91ac808bdec
[ 8822.096692]  00ff9c1336401400 ffff9c151e2d8980 0000000000000003 ffff9c136db28140
[ 8822.096697] Call Trace:
[ 8822.096708]  [<ffffffff98c0d6d9>] ? __schedule+0x239/0x6f0
[ 8822.096713]  [<ffffffff98c0dbc2>] ? schedule+0x32/0x80
[ 8822.096717]  [<ffffffff98c10ed7>] ? schedule_timeout+0x167/0x380
[ 8822.096724]  [<ffffffff986e65a0>] ? del_timer_sync+0x50/0x50
[ 8822.096728]  [<ffffffff98c0e729>] ? wait_for_completion_timeout+0xf9/0x130
[ 8822.096734]  [<ffffffff986a3860>] ? wake_up_q+0x70/0x70
[ 8822.096756]  [<ffffffffc0392a4a>] ? usb_start_wait_urb+0x8a/0x170 [usbcore]
[ 8822.096772]  [<ffffffffc039c8b9>] ? proc_bulk+0x229/0x3a0 [usbcore]
[ 8822.096787]  [<ffffffffc039f324>] ? usbdev_do_ioctl+0x7d4/0x1250 [usbcore]
[ 8822.096792]  [<ffffffff9881f2d8>] ? dput+0xc8/0x250
[ 8822.096806]  [<ffffffffc039fdac>] ? usbdev_compat_ioctl+0xc/0x10 [usbcore]
[ 8822.096811]  [<ffffffff98861eff>] ? compat_SyS_ioctl+0xbf/0x1150
[ 8822.096816]  [<ffffffff98603c45>] ? do_int80_syscall_32+0x65/0xc0
[ 8822.096820]  [<ffffffff98c13ee2>] ? entry_INT80_compat+0x72/0xa0

Nothing unusual is happening in udevadm monitor or /var/log/syslog. I get the same results with or without a USB hub (I have removed other devices connected to the USB hub) and on all USB ports.

I believe the problem to be with my laptop/OS but I’m not sure how to debug further without pulling apart the kernel. I will try swapping the laptop body first.

Finally, I tried swapping my X230 for another one and the problem persists. It must be something to do with the way my Debian installation recognises the TX2. I’m going to shelve this for now as it’s taking too long and I have a way to re-flash using another laptop.

Any input or ideas welcome. Thanks

Your host PC’s USB might be having autosuspend-related issues, e.g., waking from autosuspend without power being restored correctly…but I’m just speculating based on the “wake_up_q” in the trace (autosuspend is unlikely to actually be part of this issue; this is grasping at straws).
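
If you want to rule the autosuspend guess in or out, the host's settings can be read from sysfs. This is a rough sketch, not a diagnosis: `usb_power_report` is a hypothetical helper, it only looks at devices with vendor ID 0955 (the NVIDIA ID from the lsusb output earlier in the thread), and the optional argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Hypothetical helper: print the global usbcore autosuspend delay and the
# power/control setting of every USB device with vendor ID 0955.
# usage: usb_power_report [sysfs_root]   (defaults to /sys)
usb_power_report() {
    root=${1:-/sys}
    param="$root/module/usbcore/parameters/autosuspend"
    if [ -r "$param" ]; then
        echo "usbcore autosuspend delay: $(cat "$param")"
    fi
    for dev in "$root"/bus/usb/devices/*; do
        # Interface entries have no idVendor file; skip them.
        [ -r "$dev/idVendor" ] || continue
        [ "$(cat "$dev/idVendor")" = "0955" ] || continue
        echo "${dev##*/}: power/control=$(cat "$dev/power/control")"
    done
    return 0
}

# Example:
#   usb_power_report
```

A power/control value of "auto" means the kernel may autosuspend the device; writing "on" to that file as root keeps it awake, and booting the host with usbcore.autosuspend=-1 disables autosuspend globally. If flashing behaves the same with autosuspend off, this theory can be dropped.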

I have seen something vaguely similar on Debian-based systems when the UEFI boot firmware did not match the booted system. On your host you might check whether there are any updates, especially for UEFI. You might even check whether your laptop has any BIOS updates available (though if it is UEFI, then by definition it isn’t really called a “BIOS” anymore, but check the manufacturer for any BIOS flash updates).