Jetson TX2 NVMe Hotplug/Hotswap

Hello,
I am currently trying to use docks for NVMe SSDs to make it possible to hotplug or hotswap them. We would like the TX2 to process and record huge amounts of data, which makes swapping the SSDs necessary.

When I boot the system (Jetson TX2 Development Kit, L4T 32.3.1) without the NVMe attached and then attach it, nothing is found. Re-enumerating the PCIe bus (echo 1 > /sys/bus/pci/rescan) does not work, and neither does setting the Linux bootargs pcie_aspm=off and pci=pcie_bus_perf.

Neither of the following shows the device:

lspci -vvv
ls /sys/class/nvme/
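
(For anyone trying to reproduce this: on L4T the kernel bootargs are typically set via the APPEND line in /boot/extlinux/extlinux.conf. A minimal sketch, assuming the stock layout; the LABEL and root device below may differ on your setup:)

# /boot/extlinux/extlinux.conf (stock L4T layout; adjust for your setup)
LABEL primary
      MENU LABEL primary kernel
      LINUX /boot/Image
      APPEND ${cbootargs} root=/dev/mmcblk0p1 rw rootwait pcie_aspm=off pci=pcie_bus_perf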

When booting with the NVMe attached it is mountable, but if I remove and reattach it, it is visible again via lspci while the /dev/nvme device is gone and there is nothing under /sys/class/nvme/, so I think the nvme driver was not loaded again. Re-enumerating again did not work, and neither did setting the recommended bootargs.

Before Hotswap:

user@tx2:~$ sudo lspci -v
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 381
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	Memory behind bridge: 40100000-401fffff
	Capabilities: [40] Subsystem: NVIDIA Corporation Device 0000
	Capabilities: [48] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/2 Maskable- 64bit+
	Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
	Capabilities: [80] Express Root Port (Slot+), MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport

01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981 (prog-if 02 [NVM Express])
	Subsystem: Samsung Electronics Co Ltd Device a801
	Flags: bus master, fast devsel, latency 0, IRQ 381
	Memory at 40100000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [b0] MSI-X: Enable+ Count=33 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
	Capabilities: [158] Power Budgeting <?>
	Capabilities: [168] #19
	Capabilities: [188] Latency Tolerance Reporting
	Capabilities: [190] L1 PM Substates
	Kernel driver in use: nvme

After Hotswap:

user@tx2:~$ sudo lspci -v
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 381
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	Memory behind bridge: 40100000-401fffff
	Capabilities: [40] Subsystem: NVIDIA Corporation Device 0000
	Capabilities: [48] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/2 Maskable- 64bit+
	Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
	Capabilities: [80] Express Root Port (Slot+), MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: pcieport

01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981 (prog-if 02 [NVM Express])
	Subsystem: Samsung Electronics Co Ltd Device a801
	Flags: fast devsel, IRQ 381
	Memory at 40100000 (64-bit, non-prefetchable) [disabled] [size=16K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [b0] MSI-X: Enable- Count=33 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
	Capabilities: [158] Power Budgeting <?>
	Capabilities: [168] #19
	Capabilities: [188] Latency Tolerance Reporting
	Capabilities: [190] L1 PM Substates

Does anybody have any experience with hotplugging NVMe SSDs? Can I somehow force the nvme driver to probe again?

Thanks
Johannes

It seems the kernel is built without support for PCI hotplug:

gunzip -c /proc/config.gz | grep CONFIG_HOTPLUG_PCI
# CONFIG_HOTPLUG_PCI is not set

I'm not sure it would work, but you could first try rebuilding the kernel with it enabled.
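
(A minimal sketch of enabling the option, assuming the usual L4T kernel source layout with kernel/kernel-4.9 and a cross-compile environment already set up; the defconfig name and paths may differ on your release:)

cd kernel/kernel-4.9
make tegra_defconfig                     # per the L4T kernel build docs
./scripts/config --file .config --enable HOTPLUG_PCI --enable HOTPLUG_PCI_PCIE
make olddefconfig
make -j"$(nproc)" Image                  # then copy Image to /boot on the Jetson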

Thanks very much. I totally missed that. Will try to rebuild the kernel.

I rebuilt it with CONFIG_HOTPLUG_PCI, but it still behaves the same. If I boot the system without the NVMe attached and then attach it, no device shows up with lspci and there is no new output in dmesg.

user@tx2:~$ gunzip -c /proc/config.gz | grep CONFIG_HOTPLUG_PCI
CONFIG_HOTPLUG_PCI_PCIE=y
CONFIG_HOTPLUG_PCI=y
CONFIG_HOTPLUG_PCI_CPCI=y
CONFIG_HOTPLUG_PCI_SHPC=y

Hotswapping seems to work now; hotplugging does not.

If I boot the device with the NVMe attached and then remove and reattach it, lspci shows the device but the nvme driver does not seem to be loaded. If I remove and rescan the PCIe port, the driver seems to reload and I can mount the NVMe again.

echo 1 > /sys/bus/pci/devices/0000\:00\:01.0/remove
echo 1 > /sys/bus/pci/rescan
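
(Put together, the swap cycle looks roughly like this. This is a sketch; the mount point /mnt/data and the partition name are placeholders for my setup:)

sudo umount /mnt/data                                        # stop using the old SSD first
echo 1 | sudo tee /sys/bus/pci/devices/0000:00:01.0/remove   # tear down the root port
# physically swap the SSD here
echo 1 | sudo tee /sys/bus/pci/rescan                        # re-enumerate; nvme rebinds
ls /dev/nvme*                                                # device node should be back
sudo mount /dev/nvme0n1 /mnt/data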

CONFIG_HOTPLUG_PCI, CONFIG_HOTPLUG_PCI_PCIE, pcie_aspm=off and pci=pcie_bus_perf are all set.

I tried using the patch from Force rescan of PCIe bus? - #6 by vidyas. Now lspci shows the PCI bridges, and when I do a rescan it detects some device and outputs something to dmesg, but lspci does not show useful information and there is still no /dev/nvme device.

Before attaching NVMe:

user@tx2:~$ lspci
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)
00:03.0 PCI bridge: NVIDIA Corporation Device 10e6 (rev a1)

After attaching the NVMe and executing echo 1 > /sys/bus/pci/rescan:

user@tx2:~$ lspci
pcilib: Cannot open /sys/bus/pci/devices/0000:02:00.0/config
lspci: Unable to read the standard configuration space header of device 0000:02:00.0
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)
00:03.0 PCI bridge: NVIDIA Corporation Device 10e6 (rev a1)

The dmesg output when I remove and rescan the PCI bridge:

[   83.832311] pcie_pme 0000:00:01.0:pcie001: unloading service driver pcie_pme
[   83.832401] aer 0000:00:01.0:pcie002: unloading service driver aer
[   83.832464] pci_bus 0000:01: busn_res: [bus 01] is released
[   83.832563] iommu: Removing device 0000:00:01.0 from group 55
[   94.948917] pci 0000:00:01.0: [10de:10e5] type 01 class 0x060400
[   94.949044] pci 0000:00:01.0: PME# supported from D0 D1 D2 D3hot D3cold
[   94.949308] iommu: Adding device 0000:00:01.0 to group 55
[   94.949317] arm-smmu: forcing sodev map for 0000:00:01.0
[   94.949559] pci_bus 0000:01: busn_res: can not insert [bus 01-ff] under [bus 00-ff] (conflicts with (null) [bus 02])
[   94.949566] pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
[   94.949636] pci 0000:02:00.0: [144d:a808] type 00 class 0x010802
[   94.949694] pci 0000:02:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[   94.950267] iommu: Adding device 0000:02:00.0 to group 57
[   94.950274] arm-smmu: forcing sodev map for 0000:02:00.0
[   94.950368] pci_bus 0000:02: busn_res: [bus 02] end is updated to 02
[   94.950669] pcieport 0000:00:01.0: Signaling PME through PCIe PME interrupt
[   94.950676] pcie_pme 0000:00:01.0:pcie001: service driver pcie_pme loaded
[   94.950782] aer 0000:00:01.0:pcie002: service driver aer loaded

Hot-plug / hot-swap is not supported on TX2. Although hot-swap may work, it is not advisable to do it the way you are doing it, as the HW is not being taken through a clean path.
The best we can do here is to compile the PCIe host controller driver as a module, unload it before removing the device, and then load it again after connecting a different device. This should work without any issues.
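
(In shell terms, the sequence would look something like this. A sketch, assuming the TX2 host controller driver was built as a module with CONFIG_PCI_TEGRA=m; the mount point is a placeholder:)

sudo umount /mnt/data        # make sure nothing is using the SSD
sudo rmmod pci-tegra         # tears down the whole PCIe hierarchy cleanly
# physically swap the SSD here
sudo modprobe pci-tegra      # re-enumerates everything; nvme rebinds
                             # (or insmod the .ko directly if it is not installed)
sudo mount /dev/nvme0n1 /mnt/data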

Hi,
thanks very much @vidyas. This seems to work. I rebuilt the kernel with CONFIG_PCI_TEGRA=m, CONFIG_HOTPLUG_PCI=y and CONFIG_HOTPLUG_PCI_PCIE=y, and now I can hotplug the NVMe. It is not perfect, as we always need to be writing to one NVMe, so unloading and reloading the whole module is a little bit problematic, but I think we can make this work with some buffers.

user@tx2:~$ lspci

user@tx2:~$ sudo insmod /home/user/Downloads/pci-tegra.ko 

user@tx2:~$ lspci
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1)
00:03.0 PCI bridge: NVIDIA Corporation Device 10e6 (rev a1)
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981

user@tx2:~$ ls /sys/class/nvme/
nvme0

@vidyas, thanks for the solution with unloading the driver, but we have problems with it because some other PCIe devices are attached whose connection cannot be interrupted. Would using the Xavier platform make a difference? Do you think it would be possible to hotswap / hotplug NVMes there?

Well, in the case of Xavier platforms, we have debugfs based hot_plug and planned hot_unplug available.
One can execute
cat /sys/kernel/debug/pcie-x/hot_plug
to get the device enumerated and execute
cat /sys/kernel/debug/pcie-x/hot_unplug
before physically removing the device.
In both the above commands, replace ‘x’ with the controller number under which the NVMe device is connected. That way, the PCIe hierarchies of other controllers won’t get disturbed.
Please do note that we still don’t have support for unplanned/surprise device removal. Also, if the NVMe device is connected behind a PCIe switch, then all the other devices under that PCIe switch get removed and re-enumerated by the above commands.
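
(For example, a planned swap on the controller hosting the NVMe might look like this. A sketch; the controller number 5 and the mount point are placeholders for whatever your board uses:)

sudo umount /mnt/data
sudo cat /sys/kernel/debug/pcie-5/hot_unplug   # planned removal
# physically swap the SSD here
sudo cat /sys/kernel/debug/pcie-5/hot_plug     # re-enumerate the new device
sudo mount /dev/nvme0n1 /mnt/data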

Thank you @vidyas.

Hotswapping works just like you described, but when I boot the system without the NVMe attached and then attach it, there is no /sys/kernel/debug/pcie-x directory.

Well, that is because the root port is powered down if there is no link-up at boot time. If you want the root port to be enumerated irrespective of link-up, please remove “nvidia,enable-power-down;” from the respective DT node entry.
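
(As an illustration, the change is just deleting that one property from the controller’s node in the device tree sources before rebuilding and flashing the DTB. The node address below is hypothetical; use the controller your NVMe slot is actually wired to:)

pcie@141a0000 {                      /* hypothetical: your NVMe controller node */
	status = "okay";
	/* nvidia,enable-power-down;     <- remove this line so the root port
	   stays powered and enumerated even without link-up at boot */
};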
