I used the M.2 Key M slot to install a Samsung V-NAND SSD 970 PRO on my AGX Xavier (Developer Kit, L4T release R32.2.1).
I want to move data between the SSD and GPU memory through DMA, so I found this project https://github.com/enfiskutensykkel/ssd-gpu-dma that provides an API for building userspace NVMe drivers.
I tried to run it, but it seems it cannot work with the SMMU enabled. So I disabled the SMMU for PCIe controller-0 following the instructions in comment #4 of https://devtalk.nvidia.com/default/topic/1043746/jetson-agx-xavier/pcie-smmu-issue/. After reflashing the Xavier board with the new device tree, I verified that the SMMU is disabled for PCIe controller-0 by extracting the current device tree.
However, when I tried to run one of the project's examples (https://github.com/enfiskutensykkel/ssd-gpu-dma), I got an unhandled context fault on smmu1 and new errors from the memory controller. Below is the output of ‘dmesg -w’:
Hi,
The project looks specific to x86 with desktop GPUs and may not work on Xavier. So the default GPU memory is not big enough for your use case and you need extra memory for CUDA processing on Xavier?
Yes, the default GPU memory is not big enough for my use case and I need extra memory for CUDA processing on Xavier.
In fact, my use case consists of acquiring very high-resolution images at high speed, performing some CUDA processing, and then saving both the input images and the results.
The default GPU memory of the AGX Xavier could be enough to receive a sequence of input images and process them. But at the end of the processing, the input images and the results must be saved elsewhere to free up GPU memory for the next sequence of images, and the overall memory of the Xavier is really not enough to hold all this data.
So we decided to insert a 1 TB NVMe SSD in the M.2 Key M slot. Given our real-time constraints and the speed at which images arrive on the Xavier, we want to be able to move data between GPU memory and the SSD using DMA.
Yes, the default GPU memory is not big enough for my use case and I need extra memory for CUDA processing on Xavier. My project will be part of an embedded system, so I cannot use an x86 desktop with GPUs; I need an embedded board, and the AGX Xavier seems to be a good choice. Any idea how I can move data between GPU memory and the NVMe SSD using DMA for my CUDA processing, or how to solve the mc-err errors?
But how can we completely disable the SMMU for PCIe controller-0? I modified my device tree by commenting out the two entries (shown in comment #4 of https://devtalk.nvidia.com/default/topic/1043746/jetson-agx-xavier/pcie-smmu-issue/) in the file /public_sources/hardware/nvidia/soc/t19x/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi and built the kernel from source. Next, I copied the necessary files to my host and then reflashed my Xavier board.
However, as can be seen in my comment #1, I still get an “Unhandled context fault”, now on smmu1. Notice that before I modified the device tree and reflashed, the “Unhandled context fault” occurred on smmu0.
Hi,
The modification in device tree is good. You should be able to boot up successfully.
After further investigation, we have confirmed that the project is not supported on Xavier: the PCIe P2P protocol is not supported on Xavier, and the API in nv-p2p.h differs between desktop GPUs and Jetson platforms.
I have made the necessary modifications in the project. But if the PCIe P2P protocol is not supported on Xavier, the project will not work on the AGX Xavier.
Could you explain why Xavier does not support the PCIe P2P protocol?
I understand that direct access to the GPU memory is not going to be possible in my case.
By inspecting the output of the “lspci -v” command, we can see that:
0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981 (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd Device a801
Flags: bus master, fast devsel, latency 0, IRQ 32
Memory at 1b40000000 (64-bit, non-prefetchable)
Capabilities: <access denied>
Kernel driver in use: nvme
The line with “Flags: bus master …” shows that my SSD can access system memory (unless I am mistaken).
So, would it be possible to perform DMA transfers between system memory and my SSD (using the default nvme driver)? If yes, could you please tell me how to do it? I have spent the past few days searching the internet for resources, documentation, and posts on how userspace data can be copied to an NVMe SSD through the default Linux nvme driver.
Hi,
We have Xavier 16GB, 8GB, TX2 8GB, and 4GB modules. These platforms are mainly designed for embedded use cases; Xavier 16GB is the module with the maximum memory size. From the discussion, it seems like desktop GPUs are better suited for this use case, and an existing implementation is already available there.
Due to the limitation that PCIe P2P is not supported on Xavier, we would still suggest you consider desktop GPUs.