This topic can be closed. I will try to summarize my journey and describe what ended up working for us.
Firstly, to recap the original issue: the documentation at times seemed unclear, incomplete, or contradictory. I mostly attribute this to the user's lack of familiarity with the platform, documentation, and source code.
With that said, once we became frustrated with both the documentation and our unfamiliarity with the process, we reduced the need to the following:
Problems:
- The Jetson Nano SoM pin configuration needs to be modified to provide different functions according to the custom carrier board used in the final product.
- How to verify the correct pinmux configuration is in place.
The preferred solution would lay out clearly and directly the steps necessary to solve the problem. That is, something similar to:
- Download the pinmux spreadsheet
- Generate the two .dtsi files from the spreadsheet
- Download the kernel source code and the Linaro toolchain
- Replace files x.dtsi and y.dtsi with the files generated by the pinmux spreadsheet
- Run [command to generate new .dtb file]
- Copy the new .dtb file to [Linux_for_Tegra]/…
- Execute “build_l4t_bup.sh” to generate a Bootloader Update Payload file
- Copy the Bootloader Update Payload to the target Nano system and run /usr/sbin/l4t_payload_updater_t210 “${BUPFILE}”
- Verify SoM functionality by doing [A B C]
I’m still not entirely sure of the best way to verify the configuration of each individual pin of the SoM. The best I can do currently is to run “dmesg | grep -i dtb” on the target system and verify that the reported date matches the date of my updated .dtb file. For example:
$ dmesg | grep -i dtb
[ 0.212463] DTB Build time: Jul 15 2021 12:01:29
[ 0.424809] DTB Build time: Jul 15 2021 12:01:29
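The date check above can be scripted so it gives a pass/fail answer instead of relying on reading the log by eye. The following is only a minimal sketch: the function names `dtb_build_times` and `check_dtb_build_time` are hypothetical, and it assumes the kernel log lines have the same "DTB Build time: ..." form as the output shown above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of L4T.

# Read dmesg-style text on stdin and print each distinct DTB build stamp.
dtb_build_times() {
    sed -n 's/.*DTB Build time: //p' | sort -u
}

# Succeed only if at least one stamp was found and every stamp found
# equals the expected one. Two different stamps in the log make this fail.
check_dtb_build_time() {
    expected="$1"
    found=$(dtb_build_times)
    [ -n "$found" ] && [ "$found" = "$expected" ]
}
```

On the target this would be used as `dmesg | check_dtb_build_time "Jul 15 2021 12:01:29" && echo "DTB date matches"`. It still only confirms which device tree was loaded, not how any individual pin is configured.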
However, dmesg doesn’t tell me the specific configuration of an individual pin. As such, when the software developer attempts to write code to utilize the new hardware settings and it doesn’t work as expected, it’s difficult to pinpoint whether the problem is with the hardware configuration or the application code. If the developer expected pin 123 to be configured a certain way, how can we prove the hardware is configured properly? At some point, we found
$ cat /sys/kernel/debug/gpio
We expected this to tell us the current state of the hardware. However, no attempt to update the device tree ever changed the output of this command. Eventually, not finding clear answers, I resorted to brute force by analyzing the binary contents of the SD card image, where I found MANY copies of the device tree binary. However, this still was not the solution: ultimately, it is a 4MiB SPI flash chip embedded on the SoM that is responsible for the devicetree binary that actually gets loaded. Information regarding this SPI flash chip is VERY sparse in the documentation - in fact, in the official docs, I only recall seeing it mentioned once or twice, and the mentions didn’t clearly tie it to the functioning of the device tree. I read somewhere that booting from the 4MiB flash was a recent change, i.e. it changed in L4T Rev 32.5 - but I don’t recall where I read that at this point.
Once it was clear that the docs were not getting us where we needed to be, we searched the internet at large, where, through various articles and forum posts, the conventional wisdom appeared to be that the devicetree configuration is dictated by the SD card.
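One way to make the "did anything change?" question about /sys/kernel/debug/gpio objective is to save its output before an update and diff it against the output after the reboot. This is a rough sketch only: `snapshot_state` and `state_changed` are hypothetical names, reading the debugfs file normally requires root, and whether that file reflects every pinmux setting is an assumption that would need checking.

```shell
#!/bin/sh
# Hypothetical helpers, not part of L4T.

# Copy a state file to a snapshot. $1 = destination,
# $2 = source (assumed default: /sys/kernel/debug/gpio).
snapshot_state() {
    src="${2:-/sys/kernel/debug/gpio}"
    cat "$src" > "$1"
}

# Print a unified diff of two snapshots.
# Returns 0 if they differ, 1 if identical, 2 if either is unreadable.
state_changed() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    if diff -u "$1" "$2"; then
        return 1    # identical: the update did not change this view
    fi
    return 0        # at least one line differs
}
```

For example, run `snapshot_state /tmp/gpio.before` before applying the update, `snapshot_state /tmp/gpio.after` after rebooting, then `state_changed /tmp/gpio.before /tmp/gpio.after`. An identical result only shows that this particular view did not change.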
Thus, I resorted to brute-force changing every instance of the devicetree binary on the SD card - 9? 11? more??? copies of the binary were replaced using the Linux dd command. Even then, after booting the system and using a hex editor to verify that no trace of the original devicetree binary remained anywhere on the SD card, the system still showed the same “cat /sys/kernel/debug/gpio” output and “dmesg | grep -i dtb” dates as with the original devicetree.
Since the loaded binary data matched none of the copies on the SD card, the frustration by this point was palpable! I finally solved this myself by running “lsblk”, looking at each block device present on the Nano, and searching through the binary data, where I found 3 more copies of the original devicetree binary. My first attempt at blindly (and recklessly) modifying this binary data directly with dd bricked my Nano. So, at least I knew I was getting somewhere!
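The search for devicetree copies described above can be approximated without a hex editor by scanning for the flattened devicetree magic number, 0xd00dfeed, which begins every .dtb blob. This is a read-only sketch, not the method I actually used: `find_fdt_offsets` is a hypothetical name, it assumes GNU grep, and a hit only marks a candidate, since the four bytes can also occur by chance.

```shell
#!/bin/sh
# Hypothetical helper, not part of L4T.

# Print the byte offset of every occurrence of the FDT magic
# (bytes d0 0d fe ed) in a file or block device. Only reads, never writes.
find_fdt_offsets() {
    magic=$(printf '\320\015\376\355')    # the four magic bytes, in octal
    # -a: treat binary as text, -b: print byte offset, -o: one line per match,
    # -F: fixed string. LC_ALL=C keeps grep and cut byte-oriented.
    LC_ALL=C grep -obaF "$magic" "$1" | LC_ALL=C cut -d: -f1
}
```

It could be run against an SD card image or, as root, against each block device listed by `lsblk`. Because it never writes, it avoids the kind of dd mistake that bricked my Nano.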
Eventually, after further frustration and a few more weeks of trial and error, I figured out how to at least get “dmesg | grep -i dtb” to show the correct device tree binary date.
After many weeks of working on this, I am now pretty familiar with the documentation, the process, and the source code. It’s difficult to pinpoint the specific issues with the documentation. Had I read the entirety of the documentation from start to finish, I think I could be expected to solve the problem. But reading the entirety of the docs, and comprehending the docs, would take many hours, which is ultimately what I spent to solve the original problem of “how to update the Nano devicetree”.
My solution above is broad and not specific, but it gives a general overview of the final process I am currently using to update the devicetree binary. A lot of this time could have been saved if there were a single, clear guide that specifically addressed the process to take a new Jetson Nano and change the functional hardware pinmux, from step 1 to step n. A single guide such as the one I outline above would go a long way to helping future engineers who embark upon this process.
Apologies for the long post, but it was a long process to get from the start to where I am now. With the update to Revision 32.6, as of yesterday, I overhauled my process from brute-force modifying the binary data on the flash memories with dd to one that uses the BUP process. This effort ran from my own initial involvement in May until yesterday. I would estimate no less than 100 hours of direct time working on this. A lot of that could probably be attributed to personal lack of experience and familiarity with the system. Perhaps if we had taken the normal approach to development, that is, developing the software on the Development Kit rather than building a custom carrier board and diving directly into the custom hardware, the time spent with the dev kit would have given us some insight into how the system works before we jumped straight to modifying the devicetree. Part of the problem is that the team of engineers working on this project has decades of hardware and software engineering experience, and this process is different from others that the team is already familiar with, such as the BBB or Raspberry Pi.
On one hand, I am glad I had to go through this process: I now know a lot about the internal workings of the Nano OS that I would not have learned had I been able to solve this by reading a single How-To guide. Whether that knowledge is useful…
Currently, I have a single script that needs just 4 files: 2 .sh files and the two .dtsi files from the pinmux spreadsheet. From these it creates a single executable file that, when run on a target Nano, updates the functional pinmux reliably. If I had started with this script, and known a way to verify whether the devicetree update was successful, I could have reliably changed the devicetree pinmux within 10 minutes of exporting the .dtsi files from the pinmux spreadsheet. My 100+ hours of labor allowed me to condense the entire process down to a .sh script that automates building the devicetree binary.
My current process is:
- Download pinmux spreadsheet
- Set up the pinmux spreadsheet for my target platform and export the two .dtsi files
- Run my build.sh script, which manages everything between step 2 and step 4, including downloading the necessary packages from Nvidia
- Copy the resultant executable file built by build.sh to a target Nano device
- Run the executable produced by build.sh
- Reboot the Nano
- Verify the expected “dmesg | grep -i dtb” date
The entire process from step 3 to step 7 takes about 5 minutes, depending on development hardware, internet speeds, and number of files that need to be downloaded (files are cached so they need only be downloaded the first time).