Jetson nano no display lot issues - part II

6 out of 7 Jetson Nano dev kits DOA, ordered in the last 2 weeks in 3 different countries from separate vendors. This seems related to issue:

Was able to briefly recover 4 of the units a couple of hours ago (3 of the Nano 2GB units and one Nano 4GB) using nVidia SDK Manager (NSM), but 2 have died again in the last few hours (a 2GB and a 4GB unit). After running NSM they turned on, displayed the nVidia logo, and booted (SD card created separately from ISO). But after being unplugged for an hour or two, they failed to display any video on the next power-on.

Attached is a log file from one of the 2GB units. Note that you will need to go to the bottom of the log to see the relevant Nano entries and errors (the top of the log includes a previous setup on a separate AGX Xavier unit):

SDKM_logs_JetPack_4.6_(rev.2)_Linux_for_Jetson_Nano_modules_2021-10-22_20-22-53.zip (215.8 KB)


1 Working unit (part of a Geeekpi kit packaged in China):
SN 1423221034440 - Jetson Nano 4GB (B01) sold by Amazon in France.


6 DOA Units (ordered in last two weeks):

SN 1424120075349 - Jetson Nano 2GB shipped by Amazon (sold by waveshare) in France.
SN 1424220035751 - Jetson Nano 2GB shipped by Amazon (sold by waveshare) in France.

SN 1424220017987 - Jetson Nano 2GB sold & shipped by Reichelt in Germany.
SN 1424120075349 - Jetson Nano 4GB (B01) sold & shipped by Reichelt in Germany.

2X - Jetson Nano 2GB sold by Amazon in California.
- Deployed to a client; do not (yet) have the SNs on hand.

You have the same missing file as the post you shared.

The log also shows this:

Existing tbcfile(/home/elman/nvidia/nvidia_sdk/JetPack_4.6_Linux_JETSON_NANO_TARGETS/Linux_for_Tegra/bootloader/nvtboot_cpu.bin) reused.
copying tbcdtbfile(/home/elman/nvidia/nvidia_sdk/JetPack_4.6_Linux_JETSON_NANO_TARGETS/Linux_for_Tegra/kernel/dtb/tegra210-p3448-0003-p3542-0000.dtb)… done.
ERROR xmllint not found! To install - please run: “sudo apt-get install libxml2-utils”
*** ERROR: flashing failed.

I am not sure how you flashed your board with sdkmanager. With this error, it looks like you have not actually flashed any of your boards yet.
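A quick way to spot this kind of failure before assuming a board was flashed is to scan the SDK Manager log for error lines. The following is a minimal Python sketch, not an NVIDIA tool: the function name `find_flash_errors` is made up for illustration, the sample path is abbreviated, and the only assumption is that failures are marked with the word "ERROR" as in the excerpt above.

```python
def find_flash_errors(log_text):
    """Return the lines of a flash log that contain the word ERROR."""
    return [line.strip() for line in log_text.splitlines() if "ERROR" in line]


# Sample based on the log excerpt quoted in this thread (path abbreviated).
sample = """copying tbcdtbfile(...)... done.
ERROR xmllint not found! To install - please run: "sudo apt-get install libxml2-utils"
*** ERROR: flashing failed.
"""

for line in find_flash_errors(sample):
    print(line)
```

If this prints anything, the flash most likely did not complete, whatever the board appears to do afterwards.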

Wayne,
thank you for the quick reply!

Questions:

Is it normal/expected to require using the SDK Manager on new boards?

Or only insert an SD card formatted with the ISO, as the docs state:

https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit#setup

In other words, are these boards DOA, or is this considered a normal aberration?

If it is normal to require using the SDK manager (which seems very odd), is it normal for the board to then start working and then fail to work a couple hours later?

Do you guys care to investigate this situation, eg: you can send me a shipping label (suitable for France) and I send the boards directly to nVidia for inspection?

In other words, do I ship these ones back, order more and pray the new ones work (6 out of 7 DOA is nearly impossibly bad odds, so a manufacturing QC problem is almost certain at this point), or does nVidia have a separate process to deal with this situation?

Respectfully,
Shane Saxon
Saxon Digital

SD card models have QSPI memory on the module. That content is used to boot, and has different versions in some cases such that the SD card content needs to match what the QSPI memory expects. Flashing is how you change QSPI memory. Even if units have the same SD card it doesn’t mean that it can boot unless that QSPI is correct.

eMMC models put that content in partitions.

Hi,

I am not sure where to start. It is a really complicated story so if you have any problem, please ask.

  1. My personal suggestion is to flash every new Jetson you get with sdkmanager. This is the tool we actually recommend customers use. The sdcard image boot method has only existed on Jetson for about 1~2 years, unlike the sdkmanager flash method, which has been in use for a long time.

  2. As @linuxdev commented, two different devices are involved in a full boot: the QSPI memory and your sdcard. After jetpack4.5.1, most of the boot components are on the QSPI, which means that if some software on the QSPI is broken, swapping from one sdcard to another won’t resolve the error. In that situation, only sdkmanager/flash.sh can fix it, because it reflashes both the QSPI and the sdcard.

  3. If you understand (1), then there is a potential problem. If you expect an arbitrary sdcard image to run on the Jetson Nano, then the QSPI memory needs to be pre-installed with the corresponding software. Ideally, this is done by the factory before the board ships. However, we cannot guarantee which version of the QSPI software is on the board when you receive it.
    For example, if you bought a Nano that left the factory in 2020, it will definitely not have the latest bootloader from the latest jetpack. If you insert an sdcard image with the latest jetpack into such a board, the bootloader version and kernel version will be mismatched. Theoretically it will work, but we have also heard from some users that the oldest bootloader has problems with newer jetpack releases.

  4. After jetpack4.5.1, to prevent such a mismatch, every new sdcard image copies its bootloader components to the QSPI if it detects that the QSPI version is too old. But it also deletes the bootloader components on the sdcard itself, which means that sdcard is no longer a complete image that can be used on another board.

  5. So a quick way to prevent this problem is to flash your board directly with sdkmanager, so that the software on the sdcard definitely matches the QSPI.
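To make the QSPI/sdcard interaction in the points above easier to follow, here is a toy Python model. It is only a sketch of the behaviour as described in this post, not NVIDIA's implementation: the class names, the version tuples, and the assumption that the sdcard's bootloader components are removed only when the QSPI update happens are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Board:
    qspi_version: tuple  # bootloader version stored in QSPI, e.g. (4, 4)


@dataclass
class SdCard:
    image_version: tuple         # JetPack version of the sdcard image
    has_bootloader: bool = True  # a fresh image still carries bootloader components


def first_boot(board, card):
    """Toy model: update an outdated QSPI from the card, as described in point 4."""
    if card.has_bootloader and board.qspi_version < card.image_version:
        board.qspi_version = card.image_version
        card.has_bootloader = False  # the card is no longer complete for other boards
        return "qspi updated from card"
    if board.qspi_version == card.image_version:
        return "versions match"
    return "mismatch remains; reflash with sdkmanager"


card = SdCard(image_version=(4, 6))
print(first_boot(Board(qspi_version=(4, 4)), card))  # qspi updated from card
print(first_boot(Board(qspi_version=(4, 4)), card))  # mismatch remains; reflash with sdkmanager
```

The second call shows why a card that has already updated one board cannot be relied on to fix another board with an old QSPI.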

Back to your issue, if you want to find out what happened to your board, maybe you can follow this page to dump the log.

Getting stuck at the boot logo is a common error that can have many different causes, so checking the log is the better approach.

linuxdev,
i appreciate the reply, note that all of these units are dev kits (nvidia carrier board).

I am curious to see what nvidia thinks is going on, probably need to wait until Monday pacific, when some of the mgr’s get back in the office.

Note that these units do NOT show the nVidia splash-screen on power on and the monitor does NOT detect a video signal. All the working units do show a splash-screen, regardless of whether an SD card is installed or not. So the issue seems unrelated to matching the SD card image/ver to the QSPI version. Also, does not explain the two units that started working after using SDK Manager and then went back to no video output a couple hours later (that is, no video signal detected by the monitor, NOT just a black screen).

I would also add that if I put in the wrong version of the SD card image on a working unit (eg. 4GB image on a 2GB unit) then it sticks on the nVidia splash screen and does not proceed.

Though you may be right about the QSPI memory being the culprit; perhaps they have a bad lot of QSPI chips where the bits degrade after a few hours or weeks (flash memory is notorious for this). This would correspond to the behavior I am seeing and to a lot of posts by others on this forum.

cheers,
Shane

As I already replied, you can check my last comment on how to dump the log from the uart console, so that we can see what is going on here. If this is indeed a hardware defect, then you can file an RMA request.

Also, from your first flash log, it looks like you didn’t flash your board with sdkmanager successfully, so I am not sure whether this comment is correct or not:

Also, does not explain the two units that started working after using SDK Manager and then went back to no video output a couple hours later

Wayne,
once again, thank you for the quick reply (on a Saturday, nonetheless!)

If I understand properly, you are suggesting that on the host Ubuntu PC, I run:

$ sudo apt-get install libxml2-utils

And then re-run the SDK manager on the dead boards?

Also, in terms of RMA, I already have RMAs and packing slips from all the sellers, but am holding onto the boards for a few days to see if you guys (nvidia) want to find out directly what is going on here. All of these units were purchased in the last two weeks (my units from last June are fine)… and I would add that they came from 3 different vendors in 3 different countries, and only 1 out of 7 worked out of the box.

Works = generates a video signal and displays nvidia splash-screen (with flashing ‘missing SD card logo’ in upper right when there is no SD card and only power and HDMI cable plugged in).

So is that normal?

If not, I would ask you bring this up on Monday with the team and I will hold onto these units a few days to give you folks a chance to take them directly into possession and figure out what is wrong with this production lot.

In other words, if these things are supposed to generate a video signal out of the factory, then you have a failure rate of 85%.

Perhaps a recall is urgently needed…!?

If you are supposed to run SDK Manager on all newer units, the docs need to clearly state this, so that devs can buy the necessary components and set up an Ubuntu PC. Lastly, I will try again to revive the 2 units that were on again and then dead again. Of course, if SDK Manager is the official process and these 2 continue to fail, then your DOA rate is 30%.
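For reference, the two rates quoted above are rounded. A small Python sketch of the arithmetic, using only the unit counts given in this thread (the helper name `percent` is just for illustration):

```python
def percent(part, total):
    """Percentage of `part` in `total`, rounded to one decimal place."""
    return round(100 * part / total, 1)


total_units = 7        # units ordered in the last two weeks
no_video_at_first = 6  # units with no video out of the box
dead_again = 2         # units that failed again after the first SDK Manager run

print(percent(no_video_at_first, total_units))  # 85.7
print(percent(dead_again, total_units))         # 28.6
```

So "85%" and "30%" are approximations of 6 out of 7 and 2 out of 7 respectively.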

cheers,
Shane

Hi,

If I understand properly, you are suggesting that on the host Ubuntu PC, I run:
$ sudo apt-get install libxml2-utils
And then re-run the SDK manager on the dead boards?

Yes,

  1. Try sdkmanager on the broken board. Share the sdkm log with me so I can confirm whether the board was really flashed.
  2. If you still hit this issue after reflashing with sdkmanager, dump the device log from the uart console so that we can check what is going on.
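Before re-running sdkmanager for step 1, it may help to confirm that the host PC actually has the tool the earlier flash log reported as missing. This is a minimal Python sketch using only the standard library; the helper name `missing_tools` is hypothetical, and checking `xmllint` alone is an assumption based on the single error seen in this thread (other host dependencies may exist).

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# xmllint is the tool the flash log reported as missing;
# on Ubuntu it is provided by the libxml2-utils package.
missing = missing_tools(["xmllint"])
if missing:
    print("Missing on this host:", ", ".join(missing))
else:
    print("All required tools found.")
```

The `which` parameter exists only so the check can be exercised without depending on what is installed on a particular machine.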

If this is your first time using sdkm, then the first step may take some time, based on my past experience with first-time sdkm users…

Actually, RMA is only for hardware defects. So far it is not confirmed whether your issue is a software or a hardware one. If I confirm that all these boards have a hardware problem, then I will inform the relevant people.

I think it would be better to wait for my confirmation before doing the RMA. Incorrect usage will just break a newly received module in the same way.

Wayne,
sounds good… i plan to get you the serial logs on monday.

(fyi: the only serial transceivers i have on hand are rs485… assuming that the jetson needs 5v… so ordered some ftdi uart devices to arrive on monday.)

tomorrow will re-flash the two dead boards (a 2GB & 4GB), after running the aforementioned ‘sudo…’ update and share the sdkm log.

many thanks,
Shane

One other thought to throw in: Dev kits come with two methods of power. The first being via the micro-B USB cable, the second via barrel jack. To switch one needs to change the jumper. As an example, perhaps the jumper is missing on some, but not the others, and the barrel jack would then never show a boot logo for some.

About serial UART: It uses 3.3V logic level. I don’t know if a 5V signal would damage the unit (perhaps it would), but the signal levels would be wrong. Certainly the data you’d see on a 5V cable would be wrong. Btw, default speed is 115200 8N1.
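For anyone unfamiliar with the "115200 8N1" shorthand: it means 115200 baud, 8 data bits, no parity, 1 stop bit. The Python sketch below just decodes that notation and works out the frame size; the names `SerialSettings` and `parse_settings` are made up for illustration, and it does not open any serial port.

```python
from dataclasses import dataclass


@dataclass
class SerialSettings:
    baud: int
    data_bits: int
    parity: str
    stop_bits: int

    @property
    def bits_per_frame(self):
        # 1 start bit + data bits + optional parity bit + stop bits
        parity_bits = 0 if self.parity == "N" else 1
        return 1 + self.data_bits + parity_bits + self.stop_bits

    @property
    def max_bytes_per_second(self):
        # With 8 data bits, each frame carries one byte.
        return self.baud // self.bits_per_frame


def parse_settings(text):
    """Parse a string such as '115200 8N1' into SerialSettings."""
    baud, frame = text.split()
    return SerialSettings(int(baud), int(frame[0]), frame[1].upper(), int(frame[2]))


console = parse_settings("115200 8N1")
print(console.bits_per_frame)        # 10
print(console.max_bytes_per_second)  # 11520
```

These are the values to enter in whatever terminal program is used with a 3.3V USB-to-UART adapter.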

Another thought on the topic is that when the SD card models came out I don’t think they knew there was going to be a change in QSPI content. The docs going with the dev kit are probably not going to be designed for needing a different SD card for the different QSPI content, and certainly wouldn’t mention any kind of automatic “migration” of QSPI/SD card version merging.

The 2 dead nanos are alive again, (thank you Wayne!)

As suggested, installed libxml2-utils on the Ubuntu 18.04 host PC:

$ sudo apt-get install libxml2-utils

Then re-ran the SDK Manager on both dead boards (2GB & 4GB models), after which they fully booted.

However, there was a glitch upon the second reboot of the 4GB unit (power off and on) where it got stuck on:

[ *** ] (2 of 2) A start job is running for End-user…

It alternated between “2 of 2” and “1 of 2”. After about 5 minutes I turned it off and on again, and it then booted properly. I have not been able to repeat the glitch.

So, hopefully all units are good now and will continue to test them more over the next week++.

One thing that I cannot explain, is why the 2 units seemed to work after the initial flash (though with errors as seen in the attached “…fail.zip” log files) and then failed after that.

Personally, I think it is crucial that the nvidia website be updated to reflect that the SDK Manager is required to get the current production boards setup (perhaps not required on some previous units).

cheers,
Shane

ps. Yes, on the 4GB model the power jumper needs to be connected (next to the barrel jack), though the 2GB version doesn’t have a power jumper.

SDKM_logs_JetPack_4.6_(rev.2)_Linux_for_Jetson_Nano_modules_2021-10-22_21-59-02_fail.zip (174.2 KB)
SDKM_logs_JetPack_4.6_(rev.2)_Linux_for_Jetson_Nano_modules_2021-10-22_22-16-30_fail.zip (214.5 KB)
SDKM_logs_JetPack_4.6_(rev.2)_Linux_for_Jetson_Nano_modules_2021-10-24_19-57-06_2GB_success.zip (144.1 KB)

You didn’t actually flash any software at this step, so I cannot explain what happened there.

One thing that I cannot explain, is why the 2 units seemed to work after the initial flash (though with errors as seen in the attached “…fail.zip” log files) and then failed after that.

Personally, I think it is crucial that the nvidia website be updated to reflect that the SDK Manager is required to get the current production boards setup (perhaps not required on some previous units).

What you are using is not a production board. This is an sdcard module, which is for personal evaluation use only.
Also, what I really want to say is “if you hit an error, the solution should be to use sdkmanager”. As in my previous comment, “flashing everything with sdkmanager” is just my personal suggestion based on my experience with jetson issues; it is not an official “must-have” step for each jetson sdcard module.

Things are going very well.

All units continue to boot and run reliably during our 8-day burn-in, with a heavy load (detectnet running at 1080p).

4 out of 5 of the units i ordered in the EU (Germany & France) required SDK Manager to become active. The 5th unit (its SN implies it was made before the others, as mentioned earlier in the post) showed an nVidia logo out of the box, w/o any SD card or any other modifications.

2 units in California worked with the JetPack 4.6 ISO (made with Etcher) and did not work out of the box with JetPack 4.5.1… Note that we did not try SDK Manager on them (though if needed, I suspect this would have made them work with 4.5.1).

So as advised, the take-away is that SDK Manager solves (all of my) non-boot issues. Sometimes is not required, but certainly doesn’t hurt, (perhaps always a good idea).

Thank you Wayne for the effective and prompt support!!

cheers,
Shane
