I'm running on a custom carrier board and have ported my BSP to the new L4T 35.3.1. So far so good with the devkit module (p3701-0000) and the 32GB p3701-0004, but I am running into a couple of issues with the 64GB modules.
The first issue is that most of the time, pressing the reset button results in MB1 getting stuck in a busy loop.
Plugging in power and powering on with the power button do not seem to trigger it, but flash.sh will occasionally trigger it, and then flashing fails. busyspin.log (4.5 KB)
The second issue is that once it does boot up, there are a bunch of GPU-related errors and the display never comes on. My board is using HDMI with a modified DCB from L4T 35.2.1, and it works fine with the other two modules. fullboot.log (97.0 KB) flash.log (90.5 KB)
I did try setting up a user first with tools/l4t_create_default_user.sh, but it didn't make a difference. Just more of the same errors after logging into the serial console.
The only differences in the flashing config vs. the devkit module are the DTB file and BPFDTB, and the only change to the DTB is the compatible string, which now includes p3701-0005.
The flash config:
source "${LDK_DIR}/p3701-my-board.conf.common";
# Module and config Specific Changes
BPFDTB_FILE="tegra234-bpmp-3701-0005-3737-0000.dtb";
DTB_FILE="tegra234-p3701-0005-my-board.dtb";
TBCDTB_FILE="tegra234-p3701-0005-my-board.dtb";
I have not managed to resolve the first issue.
I put the module on a developer kit and was not able to reproduce the busy spin problem with either my flash config or the standard jetson-agx-orin-devkit config. Has something changed with the power sequencer on the 64GB modules?
The reset circuit on my board is very simple. SYS_RESET_N is connected to a button through a level shifter and no changes occur to the module or carrier board power rails when it is pressed.
The second issue with the GPU I was able to fix. I needed to add tegra234-p3701-overlay.dtbo to the OVERLAY_DTB_FILE section of my flash config.
For the last 8 months I’ve been using the P3701-0000 module from the developer kit and the P3701-0004 32GB modules on several of my carrier boards and haven’t run into this issue. It’s something specific to the 64GB module. I did try using one of the P3701-0005 64GB modules on the developer kit carrier board and could not get it to busy spin.
I also tried a second 64GB module to rule out the first one being faulty.
Here is the schematic for the reset button.
The Module:
Level Shifters:
The CTL_1V8 power rail is on when the module VIN_HV and VIN_MV are enabled, and AO_3V3 is on all the time. Most of these signals are unused and go to input pins on my board's microcontroller. Only FORCE_RECOVERY, PWR_BTN_N and SYS_RST_N are used.
Sure,
Here is a bootlog of a P3701-0005 64GB module on a P3737 Carrier board flashed with jetson-agx-orin-devkit config & uefi debug binary. bootlog_64GB_devkit.log (204.3 KB)
Could you clarify why you need to change the BPMP DTB yourself? This should be handled by our script; you don't need to do anything.
I only care about the busy spin issue right now. Could you confirm that the busy spin issue only happens with the 64GB module on your custom board and does not happen on the devkit?
I've since removed that and let p3701.conf.common handle it.
Correct, it only happens with the 64GB module on my carrier. I cannot get it to happen with the 64GB module on the devkit carrier. It is reproducible on my carrier board regardless of whether the 64GB module is flashed with my config or the developer kit config.
I got the scope screenshots and everything looks OK. The 64GB module and devkit module timing is identical.
The power management micro is set to boot as soon as it gets power.
Power on sequence:
Display issue has been resolved. No need to chase that one further.
The busy spin issue is very consistent. If I apply power the system will always boot up or flash fine. Once the flash completes, it always successfully boots into the OS.
Powering on and allowing the system to boot is also always successful.
Approximately 7 out of 8 times, pressing the reset button results in a busy spin. On the occasions it does boot up, it seems to be OK.
Now here is the weird part.
If I hold down the recovery button before applying power and flashing, it is always successful.
If I turn it on, hold the recovery button and then press reset, the flash process will start, but then fail with a busy spin. It's able to get through the initial stages of identifying the module, but when it resets to program the QSPI, I think that is when it busy spins.
I will get the flash and serial logs tomorrow of this behavior.
This still all only applies to the 64GB module. Both the dev kit and 32GB modules have worked fine.
I have now done two tests.
These logs are from flashing after a reset. Predictably, it fails with a busy spin: flash_serial_log64GB_mycarrier_reset_busyspin.log (9.7 KB) flash-64G-810882-reset.log (72.9 KB)
Interestingly, the RAM_CODE is different. On flash it's 0x4000071.
The second test I have done is to log what happens when the board is continuously reset. repeated-reset.log (136.2 KB)
What I found by grepping through it is that on the occasions when MEM-BCT-1 was used, I got RAM_CODE 0x4000471 and it would busy spin. On the less common occasions when MEM-BCT-0 was used, I got RAM_CODE 0x4000431 and it booted successfully.
[0000.297] I> RAM_CODE 0x4000471
[0000.310] I> get_binary_info: Binary name: MEM-BCT-1
[0000.327] I> BCH of MEM-BCT-1 read from storage
[0000.335] I> MEM-BCT-1 header integrity check is success
[0000.350] I> MEM-BCT-1 binary is read from storage
[0000.355] I> MEM-BCT-1 binary integrity check is success
[0000.360] I> Binary MEM-BCT-1 loaded successfully at 0x40040000 (0xe580)
[0000.366] I> RAM_CODE 0x4000471
[0000.372] I> RAM_CODE 0x4000471
[0000.476] I> Busy Spin
<reset>
[0000.297] I> RAM_CODE 0x4000431
[0000.310] I> get_binary_info: Binary name: MEM-BCT-0
[0000.327] I> BCH of MEM-BCT-0 read from storage
[0000.335] I> MEM-BCT-0 header integrity check is success
[0000.350] I> MEM-BCT-0 binary is read from storage
[0000.355] I> MEM-BCT-0 binary integrity check is success
[0000.360] I> Binary MEM-BCT-0 loaded successfully at 0x40040000 (0xe580)
[0000.366] I> RAM_CODE 0x4000431
[0000.372] I> RAM_CODE 0x4000431
... boot continues
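For anyone wanting to repeat this kind of correlation on their own repeated-reset capture, here is a minimal sketch in Python. It assumes the serial log uses the same `[ssss.mmm]` timestamp prefix as the excerpt above and that a timestamp going backwards marks a new boot attempt; both are assumptions about the log layout, not something the bootloader documents. The function name `summarize_boots` and the `SAMPLE` text are made up for illustration (the sample is a trimmed copy of the excerpt above).

```python
import re
from collections import Counter

TS_RE = re.compile(r"^\[(\d+\.\d+)\]")
RAM_CODE_RE = re.compile(r"RAM_CODE (0x[0-9a-fA-F]+)")
MEM_BCT_RE = re.compile(r"Binary name: (MEM-BCT-\d+)")


def summarize_boots(log_text):
    """Tally (RAM_CODE, MEM-BCT name, busy_spin) per boot attempt.

    Assumption: a timestamp lower than the previous one starts a new boot.
    Lines without a timestamp prefix are ignored.
    """
    boots = []
    last_ts = None
    for line in log_text.splitlines():
        line = line.strip()
        m = TS_RE.match(line)
        if not m:
            continue
        ts = float(m.group(1))
        if last_ts is None or ts < last_ts:
            boots.append({"ram_code": None, "mem_bct": None, "busy_spin": False})
        last_ts = ts
        boot = boots[-1]
        rc = RAM_CODE_RE.search(line)
        if rc and boot["ram_code"] is None:
            boot["ram_code"] = rc.group(1).lower()  # keep the first value seen
        bct = MEM_BCT_RE.search(line)
        if bct and boot["mem_bct"] is None:
            boot["mem_bct"] = bct.group(1)
        if "Busy Spin" in line:
            boot["busy_spin"] = True
    return Counter((b["ram_code"], b["mem_bct"], b["busy_spin"]) for b in boots)


# Trimmed copy of the excerpt above, used only as sample input.
SAMPLE = """\
[0000.297] I> RAM_CODE 0x4000471
[0000.310] I> get_binary_info: Binary name: MEM-BCT-1
[0000.476] I> Busy Spin
[0000.297] I> RAM_CODE 0x4000431
[0000.310] I> get_binary_info: Binary name: MEM-BCT-0
[0000.366] I> RAM_CODE 0x4000431
"""

if __name__ == "__main__":
    for key, count in summarize_boots(SAMPLE).items():
        print(count, key)
```

To run it on a real capture, read the file and pass its text in, for example `summarize_boots(open("repeated-reset.log", errors="replace").read())`.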
Hi, did you check the design around pins J58 and K58, which are the strap pins for the RAM code? As said, for the unbuffered strap pins, the carrier board must ensure these pins are not pulled or driven low or high during power-on, to avoid affecting the strap levels.
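Since the reply above points at the RAM code strap pins, it can help to see exactly which bits differ between the RAM_CODE values reported earlier in this thread. The sketch below only XORs the three logged values; how individual bits map to specific strap pins is not stated in the thread, so no such mapping is assumed here. The helper name `differing_bits` and the labels are illustrative.

```python
def differing_bits(a, b):
    """Return the bit positions (bit 0 = LSB) at which a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# RAM_CODE values taken from the logs quoted in this thread.
RAM_CODES = {
    "reset, busy spin (MEM-BCT-1)": 0x4000471,
    "reset, boots (MEM-BCT-0)": 0x4000431,
    "flash after reset": 0x4000071,
}

if __name__ == "__main__":
    good = RAM_CODES["reset, boots (MEM-BCT-0)"]
    for label, value in RAM_CODES.items():
        print(f"{label}: {value:#x} differs from the booting value in bits "
              f"{differing_bits(value, good)}")
```

A single differing bit between the failing and the booting value would be consistent with one strap input being sampled at the wrong level on reset, though the logs alone do not prove which pin is responsible.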
That was it!
What is the value of the internal pull-ups/pull-downs on these pins? If I can, I want to avoid a board spin by selecting a level shifter with a higher output hold impedance.