M.2 NVMe SSD detection issue

Hi,
We use an M.2 M-key NVMe SSD (Kingston OM8PDP3256B-AB1) on our carrier board.
However, the Xavier NX SOM can’t detect it (it does not show up in lspci).
Here’s what we observed:

  1. The issue happens when using the Xavier NX devkit.
  2. The issue doesn’t happen when using a Nano or TX2 NX.
  3. The issue happens when the SSD’s file system is NTFS.
  4. The issue doesn’t happen when the SSD’s file system is ext4.
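As a rough aid for the lspci check mentioned above, here is a minimal sketch of testing whether any NVMe controller enumerated on the PCIe bus. It assumes `lspci -nn` is available on the target and relies on the standard PCI class code 0108 (mass storage, non-volatile memory controller). The function name `has_nvme_controller` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci -nn` output on stdin and succeeds
# if at least one device with PCI class code 0108 (NVMe) is listed.
has_nvme_controller() {
    grep -q '\[0108\]'
}

# Example usage on the target (left as a comment so sourcing this file
# does not touch any hardware):
#   lspci -nn | has_nvme_controller && echo "NVMe enumerated" || echo "NVMe missing"
```

Matching on the class code rather than the vendor name keeps the check independent of which SSD brand is installed.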

Do you have any suggestions for solving this problem?

Thanks.
Wayne

I am curious about points 3 and 4.

Are you talking about the following sequence?

  1. Use the host to create an ext4 file system on the NVMe.

  2. Put it in the Jetson and it is detected.

  3. On the Jetson, either format this NVMe or format it and create an NTFS partition.

  4. Reboot and it will not get detected???

Hi WayneWWW,

Consider this procedure (same carrier board throughout):

  1. Format the NVMe as ext4 with the Jetson Nano.
  2. Change the SOM to the Jetson Xavier NX.
  3. Turn on the system; the NVMe is detected.
  4. Format this NVMe as NTFS.
  5. Reboot the system; the Xavier NX can still detect this NVMe.
  6. However, after shutting the system down and powering it back on, the Xavier NX can’t detect it.
  7. Change SOM to Nano, Nano can detect it.
  8. Format this NVMe to ext4, then change SOM to Xavier NX, and Xavier
    can detect this NVMe.

It’s a little strange, but this is the issue we are facing.

Thanks.
Wayne

Hi,

Since you mentioned the Nano, I guess you are not using the devkit to test. Please test on the devkit and use only the NX to format the disk. I am not sure why you need to switch SOMs here.

Hi Wayne,

  1. From the start, I’ve said that the issue still happens when using the devkit.
  2. I switched SOMs only because the Xavier can’t detect the NTFS-formatted NVMe, so I needed the Nano to reformat the NVMe for the check.
    I think you can ignore the part about switching to the Nano.

Thanks.
Wayne

I think you had better use a clean setup. The test in #3 looks like it was done on a custom board. Please just follow our procedure here.
We don’t want to debug a custom board, to avoid unnecessary problems.

  1. Use the NX devkit + NX module. Do not switch SOMs at any point. The NX should always be on the devkit.

  2. I don’t care where the ext4 NVMe comes from. Just assume you have an NVMe with an ext4 partition here.

  3. Plug it into the NX devkit. Cold boot about 20 times. Is it detected every time?

  4. Use the NX devkit to format it as NTFS. Cold boot another 20 times. Does it fail to be detected every time?

  5. Does this issue happen with other kinds of NVMe SSDs?

  6. If ext4
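The cold-boot counts asked for in steps 3 and 4 can be gathered with something like the sketch below. It assumes the SSD shows up as `/dev/nvme0n1` when detected. The helper names and the log path in the usage note are hypothetical, and how the script is launched at each boot is left to the tester.

```shell
#!/bin/sh
# Hypothetical helpers for a cold-boot detection test.

# record_boot DEVICE LOGFILE
# Appends one line per boot: "detected" if DEVICE exists, else "missing".
record_boot() {
    if [ -e "$1" ]; then
        echo "detected" >> "$2"
    else
        echo "missing" >> "$2"
    fi
}

# summarize LOGFILE
# Prints the totals collected so far as "detected=N missing=M".
summarize() {
    d=$(grep -c '^detected$' "$1" || true)
    m=$(grep -c '^missing$' "$1" || true)
    echo "detected=$d missing=$m"
}
```

For example, run `record_boot /dev/nvme0n1 /var/log/nvme-coldboot.log` once after every power cycle, then `summarize /var/log/nvme-coldboot.log` after the 20th boot.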

Hi WayneWWW,

  1. We have 2 kinds of Xavier NX module: one with an SD slot, from the devkit, and one with on-board eMMC.
  2. Both modules were checked on the devkit carrier board.
  3. Below is what our SW team did for the test.

We downloaded the codebase (JetPack 4.6 Rev. 3) from SDK Manager; the folder is /nvidia/nvidia_sdk/JetPack_4.6_Linux_JETSON_XAVIER_NX_TARGETS/Linux_for_Tegra
and used the following commands to create the SD and eMMC images:
step 1: sudo ./apply_binaries.sh
step 2: sudo ./tools/jetson-disk-image-creator.sh -o sd_nx_32.6.1.img -b jetson-xavier-nx-devkit
step 3: sudo BOARDID=3668 BOARDSKU=0001 FAB=100 FUSELEVEL=fuselevel_production ./nvmassflashgen.sh jetson-xavier-nx-devkit-emmc mmcblk0p1

Then, using the SD image on the Jetson Xavier NX (P3668-0000) developer kit module with the Jetson Xavier NX Developer Kit carrier board (P3509):
Result: the NVMe is detected with both ext4 and NTFS (checked 20 times).
Using the eMMC image on the Jetson Xavier NX (P3668-0001) production module with the Jetson Xavier NX Developer Kit carrier board (P3509):
Result: the ext4 NVMe is detected, but the NTFS NVMe is not (checked 20 times).

What is the difference inside the SOM between the Xavier NX with an SD slot and the one with eMMC?
Thanks.

Wayne

Hi,

We cannot reproduce your issue with the NVMe drive on our side. I can still see the device in lspci and lsblk with NTFS.

Please check whether this only happens with a specific NVMe drive.

Hi WayneWWW,
We’ve tried other vendors. It seems Innodisk, WD, and Transcend drives don’t have this issue. I’ve sent a mail to the Kingston FAE about it.
However, we still want to know:

  1. Why does this issue happen only on Xavier, and not on Nano or TX2?
  2. On the devkit carrier, why do the SOM with an SD slot and the SOM with eMMC give different results?
  3. Why does a different disk format cause a different result?

We think the above questions can be answered only by NVIDIA.

Thanks.
Wayne

Could you share the dmesg and lspci output captured when the error happens?

Hi WayneWWW,
Please check them (the data was collected from the eMMC SOM on the devkit carrier).
dmesg (67.5 KB)
lspci (6.9 KB)

Thanks.
Wayne

Hi WayneWWW,
Any update?

Thanks.
Wayne

Hi WayneWWW,
Another week passed.
Any update?

Thanks.
Wayne

Hi WayneWWW,
Any update for this issue?
Thanks.

Wayne

Hi WayneWWW,
We still haven’t received anything from NVIDIA.
Please reply to us regarding this issue.
Thanks.

Wayne

Hi,

This is strange; the file system plays no role in PCIe link-up.
Try the following and let me know how it goes:

  1. Disable NVMe in bootloader.
    a. cd /Linux_for_Tegra/bootloader/
    b. Remove nvme from “boot-order” in cbo.dts
    c. dtc -I dts -O dtb -o cbo.dtb cbo.dts
    d. Add “-k CPUBL-CFG” to regular flash command

  2. Remove “nvidia,enable-power-down” from the pcie@141a0000 node and flash the DTB. You should now see the Tegra PCIe root port with domain=5. Get the register dump below from a root shell.

/home/ubuntu/reg_dump -a 0x141a00d0
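Before taking the dump, it may help to confirm that the flashed DTB really dropped the property from step 2. The sketch below checks the live device tree. It assumes the node is exposed directly under `/proc/device-tree` as `pcie@141a0000`, which is not confirmed in this thread, and the function name is made up.

```shell
#!/bin/sh
# has_power_down_prop [DT_ROOT]
# Succeeds if the pcie@141a0000 node still carries the
# "nvidia,enable-power-down" property.
# DT_ROOT defaults to /proc/device-tree; the node sitting directly under
# that root is an assumption.
has_power_down_prop() {
    [ -e "${1:-/proc/device-tree}/pcie@141a0000/nvidia,enable-power-down" ]
}

# Example usage on the target (commented out so sourcing is side-effect free):
#   has_power_down_prop && echo "property still present" || echo "property removed"
```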

Thanks,
Manikanta

Hi Manikanta,

Thanks for your response. There are two things:
First, I used the code base “jetson_linux_r32.6.1_aarch64” with the Xavier NX SOM (eMMC version) and followed your steps 1 and 2. After flashing the image and booting into the desktop, I couldn’t find the file “reg_dump” in any folder on the device.
So I used “busybox devmem” to dump the register instead:

busybox devmem 0x141a00d0 32

0x00000088

Second, I found that the SSD (Kingston OM8PDP3256B-AB1) can be detected in Ubuntu after disabling NVMe in the bootloader (step 1). Does this change (disabling NVMe in the bootloader) mean that we can’t boot with NVMe?

Thanks,

Kunyang

Hi,

The purpose here is just to dump the register, so the devmem tool from busybox works as well.

Hi,

  1. We need the register dump taken when the issue is observed with only step 2 applied.

  2. No, the system can still boot with the NVMe installed, but you can’t use the NVMe as a boot device.
    Can you continue your project without NVMe as a boot option, and use it after booting into Ubuntu?
    If yes, then you can continue with step 1 from comment #21.

Please provide the NVMe make and model; we will check whether we have the same NVMe internally to debug the link-up issue.

Thanks,
Manikanta

Hi Manikanta,

Thanks for the reminder.
For the first point, I dumped the register (0x141a00d0) in two cases:
Before removing “nvidia,enable-power-down”: I got a kernel panic when running “busybox devmem 0x141a00d0 32”; the log is attached.
kernel_panic_after_devdump.txt (5.1 KB)
After removing “nvidia,enable-power-down”: I got the output below, and the NVMe device was still not found.
busybox devmem 0x141a00d0 32
0x00000018
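To help read the two values dumped in this thread (0x00000088 and 0x00000018), here is a minimal sketch that extracts the link-state field. It assumes 0x141a00d0 is the APPL_DEBUG register (offset 0xD0) as defined in the upstream Linux pcie-tegra194 driver, where the LTSSM state occupies bits 8:3 and the value 0x11 is named as L0. Neither point is confirmed in this thread, and the function names are made up.

```shell
#!/bin/sh
# ltssm_state REGVAL
# Prints bits 8:3 of REGVAL as a decimal number.
# Assumption: REGVAL is APPL_DEBUG and bits 8:3 hold the LTSSM state.
ltssm_state() {
    echo $(( ($1 >> 3) & 0x3f ))
}

# describe_ltssm REGVAL
# Prints the state in hex and whether it equals 0x11 (assumed to be L0).
describe_ltssm() {
    state=$(ltssm_state "$1")
    case "$state" in
        17) echo "LTSSM=0x11 (L0)" ;;
        *)  printf 'LTSSM=0x%02x (not L0)\n' "$state" ;;
    esac
}

# Example usage:
#   describe_ltssm 0x00000088   # prints: LTSSM=0x11 (L0)
#   describe_ltssm 0x00000018   # prints: LTSSM=0x03 (not L0)
```

Under those assumptions, the 0x00000088 reading corresponds to a link in L0, while 0x00000018 corresponds to a link that has not reached L0, which would fit the NVMe not being found in that case.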

For the second point, you are right. The interesting thing is that the NVMe device (Kingston OM8PDP3256B-AB1) can be detected whether or not “nvidia,enable-power-down” was removed. We can access it in Ubuntu.
Details for this NVMe are at https://www.harddrivebenchmark.net/hdd.php?hdd=KINGSTON%20OM8PDP3256B-AB1&id=29256

Thanks,

Kunyang