where you could naïvely think that changing 512 to 4096 would work… but surprisingly, it doesn’t.
It seems that the assumption that sectors are 512 bytes is hard-coded into one or more of the flashing tools and scripts, and we think it is hard-coded into the bootloader chain as well.
Reformatting the NVMe to use 512-byte sectors is the simple fix, of course.
However, supporting 4096-byte sectors can be very important for many applications, especially through the device mapper, and we wanted to make sure that NVIDIA was aware of the issue.
I have been unable to flash any NVMe that has 4096-byte sectors (the SK hynix Platinum P41 2TB is my main test model).
However, the same NVMe, when reformatted to 512-byte sectors, flashes just fine.
I can switch between “can flash” and “cannot flash” merely by reformatting the NVMe using
nvme format --lbaf=0 /dev/nvme0n1 # sets 512-byte sectors
nvme format --lbaf=1 /dev/nvme0n1 # sets 4096-byte sectors
and rebooting, with my model SSD.
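To confirm which sector size is actually in effect after the reformat and reboot, one option is to read the value the Linux kernel exposes in sysfs. The following is a minimal sketch, assuming the drive appears as `nvme0n1` on the machine where you check it; the helper name `logical_block_size` and the `sysfs_root` parameter are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def logical_block_size(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the logical block size, in bytes, that the kernel reports for a block device.

    `device` is the kernel name without the /dev/ prefix, e.g. "nvme0n1" (an assumption here).
    """
    path = Path(sysfs_root) / device / "queue" / "logical_block_size"
    return int(path.read_text().strip())
```

Calling `logical_block_size("nvme0n1")` on a system with that drive attached should return 512 or 4096, depending on which LBA format was selected.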
I wish I had saved the exact logs. One of the several errors came from computing the GPT secondary table entry offsets, but there was more than one error from the flashing tool.
Note that if flashed to eMMC, the Linux kernel and devices have no problem with the 4096-byte sector device, even under synthetic high load.
And again, I will emphasize that I tested flashing with byte-identical, pristine Linux_for_Tegra directories, with only the layout sector_size changed.
Also note that I get identical failures if I edit the layout file to explicitly set num_sectors to the number of sectors reported by the drive, or implicitly by keeping num_sectors="NUM_SECTORS".
This is the exact same hardware, no changes at all, not even in the connected USB cables.
Tested NVMe drives, with identical results for both, are:
If you change the sector size from 512 to 4096, please also update num_sectors to match.
For example, if the drive reports num_sectors as 488397168 when the sector_size is 512, please use 61049646 (488397168/8) when the sector size is 4096.
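The conversion above is just a change of units over the same total capacity. A small sketch of the arithmetic follows; the function name `convert_num_sectors` is made up for this example and is not part of any flashing tool.

```python
def convert_num_sectors(num_sectors: int, old_size: int = 512, new_size: int = 4096) -> int:
    """Convert a sector count between sector sizes, keeping the total byte capacity the same."""
    total_bytes = num_sectors * old_size
    if total_bytes % new_size != 0:
        # The capacity does not divide evenly into the new sector size.
        raise ValueError(f"{total_bytes} bytes is not a multiple of {new_size}")
    return total_bytes // new_size


print(convert_num_sectors(488397168))  # 61049646
```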
[ 0.5494 ] Partition primary_gpt size 19968: too small for GPT header and GPT Entries
Error: Return value 4
This error is caused by the size of primary_gpt; you could try modifying this size.
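One plausible reading of the "too small" message can be worked out from the standard GPT layout, in which the header occupies one sector and the entry array holds 128 entries of 128 bytes each. The sketch below assumes that standard layout and does not count the protective MBR; whether the flashing tool performs exactly this check is an assumption, and the function name is hypothetical.

```python
GPT_ENTRY_COUNT = 128  # standard minimum number of partition entries
GPT_ENTRY_SIZE = 128   # bytes per partition entry


def min_primary_gpt_bytes(sector_size: int) -> int:
    """Smallest space, in bytes, for one GPT header sector plus the entry array."""
    entry_bytes = GPT_ENTRY_COUNT * GPT_ENTRY_SIZE      # 16384 bytes
    entry_sectors = -(-entry_bytes // sector_size)      # ceiling division
    return (1 + entry_sectors) * sector_size


for size in (512, 4096):
    needed = min_primary_gpt_bytes(size)
    print(size, needed, "fits" if 19968 >= needed else "too small")
```

Under these assumptions, 512-byte sectors need 16896 bytes, which fits in 19968, while 4096-byte sectors need 20480 bytes, which does not.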
I’ve checked internally. We don’t support a sector_size other than 512 for eMMC/SD/NVMe/USB since the general sector size is 512.
If you have a specific use case for a sector size of 4096, you could try modifying the XML by adding <align_boundary> 4096 </align_boundary> to every partition to make sure it is 4KiB aligned, and each size should be a multiple of 4096.
We have not verified this use case and cannot guarantee that it won’t affect image-based OTA, so we don’t suggest using a sector size of 4096.
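For anyone who tries the unverified workaround anyway, the "multiple of 4096" requirement amounts to rounding each partition size up to the next 4KiB boundary. A minimal sketch of that rounding follows; the helper names are hypothetical, and this does not touch the layout XML itself.

```python
def is_aligned(value: int, boundary: int = 4096) -> bool:
    """True if value is an exact multiple of boundary."""
    return value % boundary == 0


def align_up(value: int, boundary: int = 4096) -> int:
    """Round value up to the next multiple of boundary (unchanged if already aligned)."""
    return (value + boundary - 1) // boundary * boundary


print(is_aligned(19968), align_up(19968))  # False 20480
```

The 19968-byte primary_gpt size from the error log, for instance, is not a multiple of 4096 and would round up to 20480.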
We will not be using 4096-byte sectors unless it is “officially supported”.
There is too much of a chance of something going wrong later, otherwise.
The use-case for 4096-byte sectors is that it is much, much more efficient:
for many LVM block device operations,
large file random-access, such as database files,
LUKS encryption, which occurs at the block level.
Also, many (possibly most) NVMe drives these days use 4096-byte physical sectors, so you get different wear-patterns on the flash media using the “fake” 512-byte sectors.
The claim that “… the general sector size is 512.” is demonstrably false for almost all modern flash. Controller firmware merely does “LBA-Like” translation to accommodate ancient controllers and operating systems.
In more concrete terms, the NVMe will have to have plenty of RAM cache to help coalesce the 512-byte transfers!
If you want to see a specific example of the performance differences that can be observed, look at the performance of XFS or ZFS on 4096 vs 512 byte sectors. (Yes, I know these are not supported on Jetson platforms, but there’s a lot of experience with these in the field!)
For such a cutting-edge, modern product, it is almost … unbelievable … in this day and age!
Dear @KevinFFF
I hope you don’t mind my interjection. I wanted to inquire if there are any upcoming plans for official support of 4096-byte sectors in the near future?
Thank you in advance for your response
Sorry, but this would be a huge modification to the partition sizes. Currently, the offsets of some partitions are fixed and not 4096-aligned. A change may affect many features, like OTA updates and layout changes, so we don’t have a plan for this use case at the moment.
Perhaps a related, but simpler ask is: could NVIDIA please add a check to the partition layout manager so that if the sector_size in the flash layout xml is not 512, a useful error message is printed?
That would have saved a lot of debugging time by multiple developers, and it’s clearly affecting more than one customer.
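To illustrate how small such a check could be, here is a rough sketch that scans a layout file for unsupported sector sizes. It is not NVIDIA's code: the `sector_size` and `num_sectors` attributes come from this thread, while the `partition_layout` and `device` element names, the `type` attribute, and the function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SUPPORTED_SECTOR_SIZE = 512


def check_sector_sizes(xml_text: str) -> list[str]:
    """Return one error message per <device> whose sector_size is not 512."""
    errors = []
    root = ET.fromstring(xml_text)
    for device in root.iter("device"):
        raw = device.get("sector_size")
        if raw is None:
            continue  # nothing to check on this element
        if raw.strip() != str(SUPPORTED_SECTOR_SIZE):
            errors.append(
                f"device type={device.get('type')!r}: sector_size={raw.strip()} is not "
                f"supported; only {SUPPORTED_SECTOR_SIZE}-byte sectors can be flashed"
            )
    return errors


# Hypothetical, heavily trimmed layout used only to exercise the check.
SAMPLE = (
    '<partition_layout>'
    '<device type="nvme" sector_size="4096" num_sectors="NUM_SECTORS"/>'
    '</partition_layout>'
)
for message in check_sector_sizes(SAMPLE):
    print(message)
```

Failing early with a message like this, before any image is generated, would point directly at the layout setting instead of at a downstream GPT size error.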