I'm running on a custom carrier board and have ported my BSP to the new L4T 35.3.1. So far so good with the devkit module (p3701-0000) and the 32GB module (p3701-0004), but I am running into a couple of issues with the 64GB modules.
The first issue is that most of the time, pressing the reset button results in MB1 getting stuck in a busy loop.
Plugging in power and powering on with the power button do not seem to trigger it, but flash.sh occasionally does, and then flashing fails.
busyspin.log (4.5 KB)
The second issue is that once it does boot up, there are a bunch of GPU-related errors and the display never comes on. My board uses HDMI with a modified DCB from L4T 35.2.1, and it works fine with the other two modules.
fullboot.log (97.0 KB)
flash.log (90.5 KB)
I did try setting up a user first with tools/l4t_create_default_user.sh, but it didn't make a difference. There were just more of the same errors after logging in on the serial console.
The only differences in the flashing config versus the devkit module are the DTB file and BPFDTB, and the only change to the DTB is the compatible string, which now includes p3701-0005.
The flash config:
# Module and config Specific Changes
The top level dts:
compatible = "xxx,yyyy+p3701-0005", "nvidia,tegra234", "nvidia,tegra23x";
Sorry for the late response. Have you managed to resolve the issue, or do you still need support? Thanks.
I have not managed to resolve the first issue.
I put the module on a developer kit and could not reproduce the busy spin problem with either my flash config or the standard jetson-agx-orin-devkit config. Has something changed with the power sequencer on the 64GB modules?
The reset circuit on my board is very simple. SYS_RESET_N is connected to a button through a level shifter and no changes occur to the module or carrier board power rails when it is pressed.
I was able to fix the second issue, the one with the GPU. I needed to add tegra234-p3701-overlay.dtbo to the OVERLAY_DTB_FILE section of my flash config.
Did you probe the RESET signal to make sure it behaves correctly? Can you share the schematic for it?
Do you have an AGX Orin devkit to test whether you see the busy spin log with the 64GB modules there too?
For the last 8 months I've been using the P3701-0000 module from the developer kit and the P3701-0004 32GB modules on several of my carrier boards and haven't run into this issue. It's something specific to the 64GB module. I did try using one of the P3701-0005 64GB modules on the developer kit carrier board and could not get it to busy spin.
I also tried a second 64GB module to rule out if the first one was faulty.
Here is the schematic for the reset button.
The CTL_1V8 power rail is on when the module VIN_HV and VIN_MV are enabled, and AO_3V3 is on all the time. Most of these signals are unused and go to input pins on my board microcontroller. Only FORCE_RECOVERY, PWR_BTN_N and SYS_RST_N are used.
I will try to grab some scope shots of the reset line and module power rails next week.
Could you put the module back to devkit, boot up and share the boot up log too?
Here is a boot log of a P3701-0005 64GB module on a P3737 carrier board flashed with the jetson-agx-orin-devkit config and the UEFI debug binary.
bootlog_64GB_devkit.log (204.3 KB)
Could you provide the flash log plus the UART log captured during flashing for this 64GB module, from both the devkit and your custom board?
That means I should see four logs in total.
- I've since removed that and let p3701.conf.common handle it.
- Correct, it only happens with the 64GB module on my carrier. I cannot get it to happen with the 64GB module on the devkit carrier. It is reproducible on my carrier board regardless of whether the 64GB module is flashed with my config or the developer kit config.
- Here are the logs you requested.
From my carrier board:
flash-64G-810882-2.log (474.4 KB)
flash_serial_log64GB_mycarrier.log (192.1 KB)
From Developer kit:
flash-64G-devkit.log (475.8 KB)
flash_serial_log64GB_devkit.log (224.9 KB)
I got the scope screenshots and everything looks OK. The 64GB module and devkit module timing is identical.
The power management micro is set to boot as soon as it gets power.
Power on sequence:
Pressing the reset button:
Is this an intermittent issue? I mean, your custom board now looks OK and is able to boot into the OS again.
Please be aware that I am only talking about the busy spin. I don't want to discuss the display issue here, only the busy spin.
Display issue has been resolved. No need to chase that one further.
The busy spin issue is very consistent. If I apply power the system will always boot up or flash fine. Once the flash completes, it always successfully boots into the OS.
Powering on and allowing the system to boot is also always successful.
Approximately 7 out of 8 times, pressing the reset button results in a busy spin. On the occasions it does boot up, it seems to be OK.
Now here is the weird part.
If I hold down the recovery button before applying power and flashing, it is always successful.
If I turn it on, hold the recovery button and then press reset, the flash process will start but then fail with a busy spin. It's able to get through the initial stages of identifying the module, but I think the busy spin happens when it resets to program the QSPI.
I will get the flash and serial logs tomorrow of this behavior.
This still all only applies to the 64GB module. Both the dev kit and 32GB modules have worked fine.
Let me go straight to what to check here. Please compare the values below between a boot where the busy spin happens and one where it does not.
For example, in the latest "boot up fine" log you just shared, the info below is the same on your board and on the devkit.
[0000.302] I> Binary block-0 (partition size: 0x40000)
[0000.363] I> RAM_CODE 0x4000031
[0102.577] I> Binary block-0 (partition size: 0x40000)
[0102.822] I> RAM_CODE 0x4000031
But your previous busyspin error shows
[0000.305] I> Binary block-0 (partition size: 0x40000)
[0000.366] I> RAM_CODE 0x4000471
I noticed that. It does seem to do that every time it busy spins.
[0000.451] E> BL_CARVEOUT: Failed to allocate memory of size 0x36800000 for CO:44.
[0000.458] C> Task 0x0 failed (err: 0x49490003)
[0000.463] E> Top caller module: BL_CARVEOUT, error module: BL_CARVEOUT, reason: 0x03, aux_info: 0x00
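To see how the two RAM_CODE values quoted so far differ, it can help to XOR them and list the bit positions that changed. The snippet below is only a sketch: the helper name `differing_bits` is made up for illustration, and this thread does not state which bit corresponds to which strap pin, so the result shows only that bits differ, not why.

```python
def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Values copied from the logs quoted in this thread.
RAM_CODE_BOOT_OK = 0x4000031
RAM_CODE_BUSY_SPIN = 0x4000471

print(hex(RAM_CODE_BOOT_OK ^ RAM_CODE_BUSY_SPIN))            # prints 0x440
print(differing_bits(RAM_CODE_BOOT_OK, RAM_CODE_BUSY_SPIN))  # prints [6, 10]
```

A small, stable set of flipped bits would be consistent with a sampled input reading differently between boots, but that interpretation is an assumption until it is checked against the module documentation.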
Yes, please check whether that value is wrong on every busy spin. Waiting for your update tomorrow. Thanks.
I have now done two tests.
These logs are from flashing after a reset. Predictably, it fails with a busy spin:
flash_serial_log64GB_mycarrier_reset_busyspin.log (9.7 KB)
flash-64G-810882-reset.log (72.9 KB)
Interestingly, the RAM_CODE is different. On flash it's 0x4000071.
The second test I have done is to log what happens when the board is continuously reset.
repeated-reset.log (136.2 KB)
What I found grepping through it is that on the occasions when MEM-BCT-1 was used, I would get RAM_CODE 0x4000471 and it would busy spin. On the less common occasions when MEM-BCT-0 was used, I would get RAM_CODE 0x4000431 and it would boot successfully.
[0000.297] I> RAM_CODE 0x4000471
[0000.310] I> get_binary_info: Binary name: MEM-BCT-1
[0000.327] I> BCH of MEM-BCT-1 read from storage
[0000.335] I> MEM-BCT-1 header integrity check is success
[0000.350] I> MEM-BCT-1 binary is read from storage
[0000.355] I> MEM-BCT-1 binary integrity check is success
[0000.360] I> Binary MEM-BCT-1 loaded successfully at 0x40040000 (0xe580)
[0000.366] I> RAM_CODE 0x4000471
[0000.372] I> RAM_CODE 0x4000471
[0000.476] I> Busy Spin
[0000.297] I> RAM_CODE 0x4000431
[0000.310] I> get_binary_info: Binary name: MEM-BCT-0
[0000.327] I> BCH of MEM-BCT-0 read from storage
[0000.335] I> MEM-BCT-0 header integrity check is success
[0000.350] I> MEM-BCT-0 binary is read from storage
[0000.355] I> MEM-BCT-0 binary integrity check is success
[0000.360] I> Binary MEM-BCT-0 loaded successfully at 0x40040000 (0xe580)
[0000.366] I> RAM_CODE 0x4000431
[0000.372] I> RAM_CODE 0x4000431
... boot continues
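The grep described above can also be done with a short script that tallies, per boot attempt, the RAM_CODE, the MEM-BCT that was selected, and whether "Busy Spin" appeared. This is a rough sketch and not an NVIDIA tool: the function names are hypothetical, the line format is taken only from the excerpts in this thread, and a new boot is assumed to start whenever the timestamp goes backwards, which may not hold for every UART capture.

```python
import re
from collections import Counter

LINE_RE = re.compile(r"^\[(\d+\.\d+)\]\s+\w>\s+(.*)$")
RAM_CODE_RE = re.compile(r"RAM_CODE (0x[0-9a-fA-F]+)")
MEM_BCT_RE = re.compile(r"Binary name: (MEM-BCT-\d+)")


def split_boots(lines):
    """Yield one list of messages per boot; a boot ends when the timestamp drops."""
    boot, last_ts = [], None
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a timestamped log line
        ts = float(match.group(1))
        if boot and last_ts is not None and ts < last_ts:
            yield boot
            boot = []
        boot.append(match.group(2))
        last_ts = ts
    if boot:
        yield boot


def summarize(lines):
    """Count (ram_code, mem_bct, busy_spin) combinations across all boots."""
    counts = Counter()
    for boot in split_boots(lines):
        text = "\n".join(boot)
        ram = RAM_CODE_RE.search(text)
        bct = MEM_BCT_RE.search(text)
        counts[(ram.group(1) if ram else None,
                bct.group(1) if bct else None,
                "Busy Spin" in text)] += 1
    return counts


# Shortened sample built from lines quoted in this thread.
SAMPLE = """\
[0000.297] I> RAM_CODE 0x4000471
[0000.310] I> get_binary_info: Binary name: MEM-BCT-1
[0000.476] I> Busy Spin
[0000.297] I> RAM_CODE 0x4000431
[0000.310] I> get_binary_info: Binary name: MEM-BCT-0
[0000.372] I> RAM_CODE 0x4000431
""".splitlines()

for key, n in summarize(SAMPLE).items():
    print(key, n)
```

To run it on a real capture, the lines of a file such as repeated-reset.log could be passed to `summarize` in place of `SAMPLE`.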
Hi, did you check the design around pins J58 and K58, which are the strap pins for the RAM code? For unbuffered strap pins, the carrier board must ensure that the pins are not pulled or driven low or high during power-on, to avoid affecting the strap levels.
You might be on to something. I have UART 5 TX (J58) and RTS (K58) connected to a TXB0108 level shifter. UARTB in this case is UART5.
I will try lifting those pins on the IC and see if it fixes the issue.
That was it!
What are the values of the internal pull-up/pull-down resistors on these pins? I want to avoid a board spin if I can by selecting a level shifter with a higher output hold impedance.
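Once the internal pull values are known, the effect of a level shifter's hold impedance can be estimated with a plain resistor divider. The sketch below is illustrative only: the 1.8 V rail, the 100 kOhm internal pull-up and the hold impedances are all placeholder assumptions, not module or level shifter specifications, and the function name is made up. It models a strap pulled up internally while the external device holds the line toward ground.

```python
def strap_voltage(v_rail: float, r_pullup_ohms: float, r_hold_to_gnd_ohms: float) -> float:
    """Voltage at a pin pulled up to v_rail through r_pullup_ohms while an
    external device holds it toward ground through r_hold_to_gnd_ohms."""
    return v_rail * r_hold_to_gnd_ohms / (r_pullup_ohms + r_hold_to_gnd_ohms)


V_RAIL = 1.8          # assumed strap I/O voltage
R_PULLUP = 100_000.0  # hypothetical internal pull-up, NOT a module spec

# Hypothetical external hold impedances to compare.
for r_hold in (4_000.0, 100_000.0, 1_000_000.0):
    v = strap_voltage(V_RAIL, R_PULLUP, r_hold)
    print(f"hold {r_hold:>9.0f} ohm -> {v:.2f} V at the strap pin")
```

The resulting voltage would then be compared against the input thresholds of the strap pin; the higher the hold impedance relative to the internal pull, the closer the pin stays to its intended level.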