Non-Transparent Bridge

boltzman · May 2, 2020, 12:32am

Hello,
I have a system where I want a Jetson TX2 to access 3 NVMe-SSDs (going through a PCIe switch) using a Non-Transparent port (another device is the actual root).
I am fairly new to this, but I have been reading up on NTB and I don’t really see how the GPU could possibly get to see all 3 SSDs through that port. As far as I know, software uses the PCI capabilities and other information in the PCI configuration space to determine that an endpoint is an NVMe device, and therefore which driver to use… Configuration requests are routed by BUS:DEVICE:FUNCTION, and devices are discovered during enumeration. While it is clear to me how address translation occurs and how the NT Host can access address space in the other domain, I am not so sure about Configuration Requests…
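To illustrate the point about configuration space: the class code bytes in the standard configuration header are what usually mark an endpoint as an NVMe controller (base class 0x01, subclass 0x08, programming interface 0x02). Below is a minimal sketch, assuming the raw header bytes are already in hand (on Linux they can be read from the device's sysfs `config` file). The helper names and the sample header are made up for illustration.

```python
def class_code(cfg):
    """Return (base class, subclass, prog IF) from raw PCI config header bytes."""
    if len(cfg) < 0x0C:
        raise ValueError("need at least the first 12 bytes of config space")
    # Offsets in the standard header: 0x09 prog IF, 0x0A subclass, 0x0B base class
    return cfg[0x0B], cfg[0x0A], cfg[0x09]


def is_nvme(cfg):
    """True if the header describes an NVMe controller."""
    return class_code(cfg) == (0x01, 0x08, 0x02)


# Hypothetical 64-byte header with only the class code filled in
sample = bytearray(64)
sample[0x09:0x0C] = bytes([0x02, 0x08, 0x01])
print(is_nvme(bytes(sample)))
```

A host behind the NT port never gets to issue these configuration reads to the SSDs, which is exactly why the usual driver binding does not happen on that side.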

I mean… Upon boot, my NT device will start generating configuration requests to enumerate the system. If the NT Bridge forwards them, there will probably be collisions with the other Root’s enumeration… but if it doesn’t, there is no way my NT device’s OS will ever be aware of the SSDs on the other side… so they will never be detected by the OS…
My guess is that the OS in the NT device needs to be made aware of this scheme so it can invoke the correct drivers… But this is only a guess.

My question is, has anybody done something similar with this module (or a similar one)? Any advice or information would be greatly appreciated :)

vidyas · May 4, 2020, 12:18pm

I’m wondering how the GPU comes into the picture here. Or is it a typo, and you really meant CPU and not GPU?

IIUC, the configuration you are trying to use has a TX2 and one PCIe switch, with the upstream port of the switch connected to the TX2 (let’s call it Host-1). The switch probably has 4 downstream ports: 3 are connected to NVMe drives and one is configured as an NT port, behind which sits another root port system, which could be another TX2 or an x86 system (call it Host-2).
In this setup, Host-1 will see all the NVMe drives, and it will also see Host-2 through the NT port. Accessing all the NVMe drives through the PCIe switch is a regular affair for Host-1. Is that what you want to know more about?
Or are you looking to do the same from Host-2? AFAIK, it is not possible to access the NVMe drives from Host-2.
Let me know what you are looking for exactly.

boltzman · May 4, 2020, 6:30pm

Hi @vidyas, thanks a lot for your reply.
Yes, sorry, when I said “GPU” I meant the GPU Module (the TX2 module), so I was referring to it as a Host. That is what we call it internally; we might be abusing the language.
The configuration you are assuming is mostly correct. The only difference is that I have some flexibility in deciding which Host will be the actual Root of the system and which Host will access it through the NT port (that can be changed by configuring the switch).
As you said, accessing the SSDs from the actual Root of the tree should be business as usual. However, I wonder how a host connected through an NT port could do that. The switch allows me to map BARs from all over the hierarchy to the Host behind the NT port, but I just don’t know if mapping BARs is enough to have the TX2 access the SSDs…
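For reference, the BAR mapping mentioned above boils down to a window translation: an access that lands inside the NT port's local window is re-addressed to a programmed base in the other domain. A minimal sketch of that arithmetic follows; the function name and all addresses are hypothetical, and real switches program the translation through vendor-specific registers.

```python
def translate(local_addr, window_base, window_size, remote_base):
    """Map an address in the local NT window to the address seen in the remote domain."""
    offset = local_addr - window_base
    if not 0 <= offset < window_size:
        raise ValueError("address is outside the NT window")
    return remote_base + offset


# Hypothetical numbers: a 64 KiB window at 0x4000_0000 on the NT host,
# translated onto an SSD BAR at 0x9000_0000 in the root's domain.
print(hex(translate(0x4000_1000, 0x4000_0000, 0x10000, 0x9000_0000)))
```

This only covers memory accesses. Nothing in the translation gives the NT host a configuration-space view of the device behind the window.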

Why do you think that is impossible? I am very surprised. If so, how do people go about accessing NVMe SSDs from two PCIe Root Hosts?

vidyas · May 6, 2020, 6:20am

Well, with some P2P support, we should be able to map the NVMe’s BAR to the host behind the NT port. But viewing it as a typical NVMe device (as in a PCIe device under the hierarchy) is still not possible: an NT port is nothing but two endpoints connected back-to-back, so the host behind it can see the BARs but can’t really see the hierarchy.
But, yeah, if we have a framework that can map the NVMe BAR to the host behind the NT port, then we should be able to work with it (probably without using the interrupts??)
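As a rough idea of what "working with it" would start from: once BAR0 of the SSD is visible through the NT window, the NVMe controller registers can be read as plain memory. The sketch below decodes the Controller Capabilities (CAP, offset 0x00) and Version (VS, offset 0x08) registers from a byte buffer. It assumes the buffer was obtained from the mapped window somehow; the function name and the sample values are invented for illustration, and this is nowhere near a working driver.

```python
import struct


def decode_nvme_regs(bar0):
    """Decode a few fields of CAP and VS from the start of an NVMe BAR0 image."""
    cap, vs = struct.unpack_from("<QI", bar0, 0)  # CAP is 64-bit, VS is 32-bit
    return {
        # MQES (bits 15:0) is zero-based
        "max_queue_entries": (cap & 0xFFFF) + 1,
        # TO (bits 31:24) is in units of 500 ms
        "timeout_ms": ((cap >> 24) & 0xFF) * 500,
        # DSTRD (bits 35:32): doorbell stride is 2^(2 + DSTRD) bytes
        "doorbell_stride_bytes": 4 << ((cap >> 32) & 0xF),
        # VS: major (31:16), minor (15:8), tertiary (7:0)
        "version": (vs >> 16, (vs >> 8) & 0xFF, vs & 0xFF),
    }


# Made-up register values: MQES=1023, TO=20, DSTRD=0, version 1.3.0
sample = struct.pack("<QI", (0x14 << 24) | 0x3FF, 0x00010300)
print(decode_nvme_regs(sample))
```

Without interrupts, completions would presumably have to be detected by polling the phase tag in the completion queue entries.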
Also, what would happen on the host under which this NVMe device already got enumerated? Do we have to remove the NVMe driver from that host, leaving just memory space and bus mastering enabled in the config space?
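For clarity, "memory and bus-mastering enabled" refers to two bits in the 16-bit Command register at offset 0x04 of the configuration header: Memory Space Enable (bit 1) and Bus Master Enable (bit 2). A small sketch that checks them in a raw header dump; the helper name and the sample header are hypothetical.

```python
import struct

MEM_SPACE_EN = 1 << 1   # device responds to memory accesses
BUS_MASTER_EN = 1 << 2  # device may issue its own memory requests (DMA)


def command_flags(cfg):
    """Report the memory-space and bus-master bits of the PCI Command register."""
    (cmd,) = struct.unpack_from("<H", cfg, 0x04)
    return {
        "memory_space": bool(cmd & MEM_SPACE_EN),
        "bus_master": bool(cmd & BUS_MASTER_EN),
    }


# Hypothetical header with Command = 0x0006 (both bits set)
sample = bytearray(64)
struct.pack_into("<H", sample, 0x04, 0x0006)
print(command_flags(bytes(sample)))
```

Both bits would have to stay set on the enumerating host's side for the SSD to keep decoding its BARs and performing DMA after the driver is unbound.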
Well, to me, it doesn’t look straightforward to get this working using the standard mechanisms. But, yeah, things seem possible with some out-of-the-box stuff.
We haven’t tried this internally and can’t really comment on the ultimate feasibility of this configuration.