Differentiate between HW Revs of Xaviers

Hi,

I want to clarify a few things first:

  1. Could you share the serial number of the SOM?

  2. Could you share the boot log from both jp4.5.1 (NG case) and jp4.6.x (good case)?

  3. Is your test based on Xavier AGX Devkit?

  4. Are you sure all the modules from the same batch have this problem? Last time, another user thought all his modules had the same problem, but actually only 1~2 modules did.

Thank you for the reply.

  1. Could you share the serial number of the SOM?

Serial number for SOM that can program in 4.6.3: 1560121408681
Serial number for SOM that can program in 4.5.1: 1560121408682

I can get more serial numbers for other SOMs that fail in 4.5.1 tomorrow if need be.

  2. Could you share the boot log from both jp4.5.1 (NG case) and jp4.6.x (good case)?

I will get this to you tomorrow.

  3. Is your test based on Xavier AGX Devkit?

No - both cases are on a custom carrier board.

  4. Are you sure all the modules from the same batch have this problem? Last time, another user thought all his modules had the same problem, but actually only 1~2 modules did.

We did a batch of 20 boards for this build. Of those 20 boards, 5 SOMs had this problem. We’ve confirmed that the boards are working by swapping the SOM with a known “flash-able” SOM and re-running the flashing process in 4.5.1 with success.

I have only tested 4.6.3 on one of those 5 boards at this point. I plan to program more of them tomorrow.

Hi,

Could you also run every test on the devkit? That is our SOP here.

I redid the analysis that showed that this was a module issue and found that it isn’t as cut and dried as just a module issue. I can make the issue happen on one particular custom carrier board (but not on others) and also on the devkit. What is a little odd is that we have seen pre-programmed modules run fine on the carrier boards that won’t flash with 4.5.1.

The carrier boards that won’t program with 4.5.1 will still program with 4.6.3 without issue.

With the same module I was able to:

  1. Flash Successfully with 4.5.1 on Carrier Board #1
  2. Fail to Flash with 4.5.1 on Carrier Board #2
  3. Flash Successfully with 4.6.3 on Carrier Board #2
  4. Fail to Flash with 4.5.1 on the Devkit
  5. Fail to Flash with 4.6.3 on the DevKit

I would expect that 4.5.1 and 4.6.3 should both work with the module on the devkit. I also tested another Xavier module and got the same results on the Devkit.

So it seems like either:

  1. There is a hardware issue with the module
  2. Or there is a software setup issue in my JetPack installation, related to the settings that we have changed.

I’m going to try and get the devkit working again and then perhaps I will have more information.

Addendum:

Here is the flash.sh result from a successful 4.5.1 Run:

Here is the flash.sh result from a failed 4.5.1 Run:

Here is the flash.sh result from the successful 4.6.3 Run on the carrier board that previously failed:

Here is a FAILED flash.sh for 4.5.1 on the DEVKIT:

This also fails in a similar way in 4.6.3 on the DEVKIT

The consistent behavior in all fail cases is:

[  10.7491 ] Sending BCTs
[  10.7502 ] tegrarcm_v2 --download bct_bootrom br_bct_BR.bct --download bct_mb1 mb1_bct_MB1_sigheader.bct.encrypt --download bct_mem mem_rcm_sigheader.bct.encrypt
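To compare which images each run was sending when it stalled, the tegrarcm_v2 line can be split into its download pairs. This is a minimal Python sketch that assumes the flash.sh log line format shown above; `parse_downloads` is a hypothetical helper name, not part of the BSP tools.

```python
import re
import shlex

# Matches the "[  10.7502 ] " timestamp prefix seen on flash.sh log lines.
PREFIX = re.compile(r"^\[\s*[\d.]+\s*\]\s*")


def parse_downloads(line):
    """Return (image_type, filename) pairs from a tegrarcm_v2 log line.

    Hypothetical helper: it only understands the "--download TYPE FILE"
    pattern visible in the log excerpt above.
    """
    tokens = shlex.split(PREFIX.sub("", line))
    if not tokens or tokens[0] != "tegrarcm_v2":
        return []
    pairs = []
    i = 1
    while i < len(tokens):
        if tokens[i] == "--download" and i + 2 < len(tokens):
            pairs.append((tokens[i + 1], tokens[i + 2]))
            i += 3
        else:
            i += 1
    return pairs
```

Running this over the last tegrarcm_v2 line of each failed log should show whether every failure stops at the same set of BCT images.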

Sorry, I am a little confused. You told me the flash fails on the devkit.

But what is that board here? It looks like you still have some customization.
Could you just download a fresh BSP and let sdkmanager flash it?

Name: AFT-00559-E, Board Family: t186ref, SoC: Tegra 194,

Also, please provide the UART log during flash for each case too.

But what is that board here? It looks like you still have some customization.
Could you just download a fresh BSP and let sdkmanager flash it?

Sorry - I must have copied the wrong thing. There are a lot of balls in the air. I tried both flashing the xavier on the devkit with our custom config and with the base config for the devkit. Both failed.

I attempted a from-scratch build today. I copied my existing JetPack 4.5.1 installation to another location, ran sdkmanager, and installed a fresh JetPack 4.5.1 for Xavier. I then ran:

sudo ./flash.sh jetson-xavier mmcblk0p1

This also fails. See this log:

Uart Log:

The error generated when running on the devkit comes from the same command (tegrarcm_v2), but the error code is different: this run reports Return value 8 instead of Return value 2.

The UART spills errors:

[0130.389] E> FAILED: Thermal config
[0130.396] E> FAILED: MEMIO rail config
[0130.411] E> Task 50 failed (err: 0x7979061c)
[0130.415] E> Top caller module: ALIASCHECKER, error module: ALIASCHECKER, reason: 0x1c, aux_info: 0x06
[0130.424] I> MB1(1.5.1.6-t194-41334769-1740dd39) BIT boot status dump :
000000000001111111111000011111111111111001100011111000000000000000000000000000000000000000000000000000000000000000000000000000001100100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000          

So that’s interesting. I’m going to try grabbing the UART log from our carrier board (successful and unsuccessful) and see how it compares.

OK, here is the failure log from our custom carrier with JetPack 4.5.1. This is basically the same as what I’ve posted earlier.

Here is the UART log for this run:

This is with the same module as the failure log I posted for the devkit, but there is a different failure mode:

[0032.830] I> Boot-device: eMMC
[0032.833] I> bct_mb1 image downloaded
[0032.843] E> NONE: Invalid value MemBct dram size: 0MB for slot: 3.
[0032.849] C> OEM authentication of MEM-BCT failed!!!
[0032.853] E> NV3P_SERVER: Failed to verify image bct_mem.
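To put the failure modes from different boards side by side, the non-informational lines can be pulled out of each UART capture. This is a minimal Python sketch that assumes the `[timestamp] LEVEL> message` layout seen in the excerpts here, and that `E>` and `C>` mark the error-class lines; `uart_errors` is a hypothetical helper name.

```python
import re

# "[0032.843] E> message" : timestamp, one-letter level, message text.
UART_LINE = re.compile(r"^\[(\d+\.\d+)\]\s+([A-Z])>\s+(.*)$")


def uart_errors(log_text, levels=("E", "C")):
    """Return (timestamp, level, message) for lines at the given levels.

    Assumption: E> and C> are the levels of interest, as in the
    excerpts quoted in this thread. Lines that do not match the layout
    (for example the raw BIT boot status dump) are skipped.
    """
    found = []
    for raw in log_text.splitlines():
        match = UART_LINE.match(raw.strip())
        if match and match.group(2) in levels:
            found.append((float(match.group(1)), match.group(2), match.group(3)))
    return found
```

Applied to the devkit and custom-carrier captures, this makes it easier to see that the first error differs between the two (thermal/MEMIO config on one, MEM-BCT verification on the other).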

This almost seems like the signing of the mem-bct is failing. We haven’t made any changes to this (wouldn’t know where to start). All of our settings changes are in the rootfs and in the kernel.

Any ideas?

As requested, I posted the flash log and uart log of the Xavier DEVKIT in this reply:

I want to make sure it didn’t get lost in my subsequent reply. Any thoughts on why the Xavier module would fail to flash in this way on the Devkit?

Could you provide me this information?

  1. The flash result of jp4.6.3 on devkit (host + uart log)

  2. The flash result of jp5.0.2 on devkit (host + uart log)

Also, please try not to add more variables here, as that causes confusion. By “variables” I mean modules, carriers, and JetPack release versions.

I think it would be better to run all tests on the devkit first. For now, check whether any of them can be flashed and booted.

Honestly, we would not check your custom board case until we have tested all the devkit cases.

Xavier Devkit 4.6.3 Fail to Flash - flash.sh log:

Xavier Devkit 4.6.3 Fail to Flash - uart log:

Xavier Devkit 5.0.2 Fail to Flash - Flash.sh log:

Xavier Devkit 5.0.2 Fail to Flash - Uart log

Hi,

Thanks for sharing.

You said “1. Flash Successfully with 4.5.1 on Carrier Board #1” and “3. Flash Successfully with 4.6.3 on Carrier Board #2”.

Could you share the UART logs of these two cases? Also, make sure to use the same module that just failed on the devkit.

Thanks.

I reran the test again to confirm that the behavior has not changed:

Successful Flash on Carrier Board #1 with SAME Xavier Module on 4.5.1

Flash log:

Uart log:

Successful Flash on Carrier Board #2 with SAME Xavier Module on 4.6.3

Flash Log:

Uart Log:

I also re-ran the test on Carrier Board #2 with the same module, but with version 4.5.1. This still fails.

Hi,

Could you help me check whether the Ram Code here keeps showing a mismatch between the successful case and the NG case, even for the same module? Please compare them on the same JetPack version.

[0000.082] I> Ram repair fuse : 0x0
[0000.085] I> Ram Code : 0x3

Also, what is the exact S/N of this module? Is it 1560121408681 or 1560121408682?

SN of Module is : 1560121408681

4.5.1 - Xavier - Custom Carrier - Failure - Ram Code =

[0031.729] I> ATE fuse revision : 0x200
[0031.733] I> Ram repair fuse : 0x0
[0031.736] I> Ram Code : 0x3
[0031.738] I> rst_source : 0xb

4.5.1 - Xavier - Devkit - Failure - Ram Code =

[0017.498] I> ATE fuse revision : 0x200
[0017.502] I> Ram repair fuse : 0x0
[0017.505] I> Ram Code : 0x2
[0017.507] I> rst_source : 0x0

4.6.3 - Xavier - Custom Carrier - Success - Ram Code =

[0022.233] I> ATE fuse revision : 0x200
[0022.236] I> Ram repair fuse : 0x0
[0022.239] I> Ram Code : 0x3
[0022.242] I> rst_source : 0xb

4.6.3 - Xavier - Devkit - Failure - Ram Code =

[0021.561] I> ATE fuse revision : 0x200
[0021.565] I> Ram repair fuse : 0x0
[0021.568] I> Ram Code : 0x2
[0021.570] I> rst_source : 0xb

What sets the RAM code in the Jetson? Is it possible that one of the pins on the carrier board is pulled to a different strap configuration, causing a different RAM code?

What is the RAM code supposed to be for the Xavier AGX?
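The comparison above can be automated across however many UART captures are on hand. This is a minimal Python sketch that assumes each log contains a `Ram Code : 0x…` line like the ones quoted; `ram_code` and `group_by_ram_code` are hypothetical helper names, and the labels are whatever you choose to call each run.

```python
import re

# Matches lines such as "[0031.736] I> Ram Code : 0x3".
RAM_CODE = re.compile(r"Ram Code\s*:\s*(0x[0-9a-fA-F]+)")


def ram_code(log_text):
    """Return the first Ram Code found in a UART log as an int, or None."""
    match = RAM_CODE.search(log_text)
    return int(match.group(1), 16) if match else None


def group_by_ram_code(logs):
    """Group run labels by the Ram Code reported in each log.

    `logs` maps a label you pick (board + release) to the log text.
    """
    groups = {}
    for label, text in logs.items():
        groups.setdefault(ram_code(text), []).append(label)
    return groups
```

With the four excerpts above, the custom-carrier runs group under 0x3 and the devkit runs under 0x2, which is the mismatch being asked about.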

We will use your serial number to check with our factory and see what ramcode it is supposed to be.

Even though they are all Xavier AGX modules, the ramcode may differ due to a PCN update.

@carlo2q4g are these newly purchased modules, or have they been in use for a while?

Could you take a picture of the sticker on the module and share it here?

@carlo2q4g

For your JetPack 4.5.1, please apply the overlay tarball here:

Overlay to support PCN208560, Jetson AGX Xavier 32GB
https://developer.nvidia.com/overlay-3251-agx-sku4tbz2

are these newly purchased modules, or have they been in use for a while?

We bought them from Arrow and they have been kept in dry storage. They have not been used in production yet.

Could you take a picture of the sticker on the module and share it here?

For your JetPack 4.5.1, please apply the overlay tarball here.

OK - I will give that a try tomorrow.

Thank you!

I was able to flash correctly with 4.5.1 for one of the modules that was not flashing correctly previously. We will start applying the same process to others. I assume they will also work.

Thank you!


I’m closing this topic since there has been no update from you for a while, assuming this issue was resolved.
If you still need support, please open a new topic. Thanks.

Hi @carlo2q4g

I also want to double-check: are you sure “rel-32.7.3” cannot flash these modules on the devkit?