Increasing size of BAR0 to 2GB in Endpoint Mode

Endpoint: Jetson AGX Orin on custom carrier, R35.6
Root: Jetson AGX Orin on custom carrier, R35.6

Hello:
I successfully increased the PCIe bar0_size from 64KB to 64MB by making the following changes (a sketch of both edits follows this list):

  • Modify the CONFIG_CMA_SIZE_MBYTES parameter in the EP kernel defconfig from 64 to 128;

  • Modify the macro BAR0_SIZE in the EP driver pci-epf-nv-test.c to SZ_64M.
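
For reference, a minimal sketch of those two edits (the defconfig path below is an assumption; use whichever defconfig your EP kernel is built from):

--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
-CONFIG_CMA_SIZE_MBYTES=64
+CONFIG_CMA_SIZE_MBYTES=128

--- a/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c
+++ b/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c
-#define BAR0_SIZE SZ_64K
+#define BAR0_SIZE SZ_64M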

To change bar0_size to 2GB, in addition to the equivalent operations above, I also made the following modification:

  • After decompiling the RP device tree (DTB), change the "ranges" property of the node "pcie@141a0000" from
ranges = <0x81000000 0x00 0x3a100000 0x00 0x3a100000 0x00 0x100000
          0x82000000 0x00 0x40000000 0x2b 0x28000000 0x00 0x8000000
          0xc3000000 0x27 0x40000000 0x27 0x40000000 0x03 0xe8000000>;

to

ranges = <0x81000000 0x00 0x3a100000 0x00 0x3a100000 0x00 0x00100000
          0x82000000 0x00 0x40000000 0x2a 0x700000000 0x0 0xc0000000
          0xc3000000 0x27 0x40000000 0x27 0x40000000 0x3 0x30000000>;

However, this modification does not work. How can I change bar0_size to 2GB?
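
For context, each entry in that "ranges" property is 7 cells (3 PCI address cells, 2 CPU address cells, 2 size cells), so my reading of the original value, entry by entry, is:

0x81000000  0x00 0x3a100000   0x00 0x3a100000   0x00 0x00100000   /* I/O space, 1 MB */
0x82000000  0x00 0x40000000   0x2b 0x28000000   0x00 0x08000000   /* 32-bit non-prefetchable memory, 128 MB */
0xc3000000  0x27 0x40000000   0x27 0x40000000   0x03 0xe8000000   /* 64-bit prefetchable memory, ~15.6 GB */

The 32-bit non-prefetchable window (second entry) is only 128 MB, which is why I tried to enlarge that entry to fit a 2 GB BAR0.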

Hi,

If you are designing a custom carrier board, some adaptation configuration is needed; otherwise, your board may not work correctly.

For the Orin AGX series, you can refer to the document below:
https://docs.nvidia.com/jetson/archives/r36.3/DeveloperGuide/HR/JetsonModuleAdaptationAndBringUp/JetsonAgxOrinSeries.html?highlight=universal%20serial%20bus#jetson-agx-orin-platform-adaptation-and-bring-up
(Please be aware that the above link is for rel-36.3 / JetPack 6.0.)

This document covers the following configuration topics:

  1. pinmux change & GPIO configuration
  2. EEPROM change, as most custom boards do not have an EEPROM.
  3. Kernel porting
  4. PCIe configuration
  5. USB configuration
  6. MGBE configuration
  7. RGMII configuration

Thanks!

Yes, the above operations have been done and checked.

The ranges cannot be changed. If you change BAR0_SIZE in the EP driver to SZ_2G and it does not work, it means a 2 GB BAR0 size is not supported.

I saw another post in the forum where a Jetson AGX Xavier was used as the EP and an x86 PC as the RP:
https://forums.developer.nvidia.com/t/increasing-size-of-bar0-in-endpoint-mode/232498
Why can AGX Xavier modify bar0_size, but AGX Orin can’t?

The Xavier case is also just a test patch; it is not officially supported.

The AMAP (address map) per controller must stay within a fixed range, so you cannot configure that "ranges" property arbitrarily.

Could you please provide an unofficial "test patch" for AGX Orin?

We already checked with the internal team; there is no support for this.

OK, thank you. Besides BAR0, is there any other way to use PCIe to transfer large amounts of data? Right now bar0_size is only 64MB, which is too small.

Hi,

Our team once enlarged the BAR0 size to 512M by changing only BAR0_SIZE in the driver. What error occurred on your side such that you can only enable 64M?

This patch is for JetPack 5. You may try it first.

diff --git a/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c b/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c
index 1959c8402..e21913418 100644
--- a/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c
+++ b/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-nv-test.c
@@ -17,7 +17,7 @@
 #include <linux/pci-epf.h>
 #include <linux/version.h>
 
-#define BAR0_SIZE SZ_64K
+#define BAR0_SIZE SZ_512M
 
 struct pci_epf_nv_test {
 	struct pci_epf_header header;
@@ -47,7 +47,7 @@ static int pci_epf_nv_test_core_init(struct pci_epf *epf)
 	epf_bar->size = BAR0_SIZE;
 	epf_bar->barno = BAR_0;
 	epf_bar->flags |= PCI_BASE_ADDRESS_SPACE_MEMORY |
-				PCI_BASE_ADDRESS_MEM_TYPE_32;
+			  PCI_BASE_ADDRESS_MEM_TYPE_32;
 
 	ret = pci_epc_set_bar(epc, epf->func_no, epf_bar);
 	if (ret) {
@@ -92,7 +92,6 @@ static void pci_epf_nv_test_unbind(struct pci_epf *epf)
 	struct pci_epf_nv_test *epfnv = epf_get_drvdata(epf);
 	struct pci_epc *epc = epf->epc;
 	struct device *cdev = epc->dev.parent;
-	struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
 #if (LINUX_VERSION_CODE > KERNEL_VERSION(4, 15, 0))
 	struct pci_epf_bar *epf_bar = &epf->bar[BAR_0];
 #endif
@@ -103,10 +102,8 @@ static void pci_epf_nv_test_unbind(struct pci_epf *epf)
 #else
 	pci_epc_clear_bar(epc, BAR_0);
 #endif
-	vunmap(epfnv->bar0_ram_map);
-	iommu_unmap(domain, epfnv->bar0_iova, PAGE_SIZE);
-	iommu_dma_free_iova(cdev, epfnv->bar0_iova, BAR0_SIZE);
-	__free_pages(epfnv->bar0_ram_page, 1);
+	dma_free_coherent(cdev, BAR0_SIZE, epfnv->bar0_ram_map,
+			  epfnv->bar0_iova);
 }
 
 static int pci_epf_nv_test_bind(struct pci_epf *epf)
@@ -118,8 +115,8 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
 #endif
 	struct device *fdev = &epf->dev;
 	struct device *cdev = epc->dev.parent;
-	struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
 	int ret;
+#if 0
+	int i;
+#endif

 #if (LINUX_VERSION_CODE <= KERNEL_VERSION(4, 15, 0))
 	ret = pci_epc_write_header(epc, header);
@@ -129,40 +126,29 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
 	}
 #endif
 
-	epfnv->bar0_ram_page = alloc_pages(GFP_KERNEL, 1);
-	if (!epfnv->bar0_ram_page) {
-		dev_err(fdev, "alloc_pages() failed\n");
-		ret = -ENOMEM;
-		goto fail;
-	}
-	dev_info(fdev, "BAR0 RAM phys: 0x%llx\n",
-		 page_to_phys(epfnv->bar0_ram_page));
+	epfnv->bar0_ram_map = dma_alloc_coherent(cdev, BAR0_SIZE,
+					&epfnv->bar0_iova, GFP_KERNEL);
 
-	epfnv->bar0_iova = iommu_dma_alloc_iova(cdev, BAR0_SIZE,
-						cdev->coherent_dma_mask);
-	if (!epfnv->bar0_iova) {
-		dev_err(fdev, "iommu_dma_alloc_iova() failed\n");
+	if (!epfnv->bar0_ram_map) {
+		dev_err(fdev, "dma_alloc_coherent() failed\n");
 		ret = -ENOMEM;
-		goto fail_free_pages;
+		return ret;
 	}
 
 	dev_info(fdev, "BAR0 RAM IOVA: 0x%08llx\n", epfnv->bar0_iova);
-
-	ret = iommu_map(domain, epfnv->bar0_iova,
-			page_to_phys(epfnv->bar0_ram_page),
-			PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);
-	if (ret) {
-		dev_err(fdev, "iommu_map(RAM) failed: %d\n", ret);
-		goto fail_free_iova;
-	}
-	epfnv->bar0_ram_map = vmap(&epfnv->bar0_ram_page, 1, VM_MAP,
-				   PAGE_KERNEL);
-	if (!epfnv->bar0_ram_map) {
-		dev_err(fdev, "vmap() failed\n");
-		ret = -ENOMEM;
-		goto fail_unmap_ram_iova;
+	dev_info(fdev, "BAR0 RAM VA  : 0x%p\n", epfnv->bar0_ram_map);
+	dev_info(fdev, "BAR0 RAM PA  : 0x%08lx\n",
+		 (vmalloc_to_pfn(epfnv->bar0_ram_map) << PAGE_SHIFT));
+
+#if 0
+/* Ideally we shouldn't use devmem on the physical address in DRAM.
+ * Instead, modify the data in EP BAR0 through the virtual address
+ * (epfnv->bar0_ram_map) in the driver, then read it back via the BAR0
+ * address on the RP.
+ */
+	for(i = 0; i < BAR0_SIZE; i+= PAGE_SIZE) {
+		*((unsigned long *)(epfnv->bar0_ram_map + i)) = 0x36578912;
 	}
-	dev_info(fdev, "BAR0 RAM virt: 0x%p\n", epfnv->bar0_ram_map);
+#endif
 
 #if (LINUX_VERSION_CODE <= KERNEL_VERSION(4, 15, 0))
 	ret = pci_epc_set_bar(epc, BAR_0, epfnv->bar0_iova, BAR0_SIZE,
@@ -170,7 +156,7 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
 			      PCI_BASE_ADDRESS_MEM_TYPE_32);
 	if (ret) {
 		dev_err(fdev, "pci_epc_set_bar() failed: %d\n", ret);
-		goto fail_unmap_ram_virt;
+		goto fail_set_bar;
 	}
 #endif
 
@@ -182,17 +168,11 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
 	return 0;
 
 #if (LINUX_VERSION_CODE <= KERNEL_VERSION(4, 15, 0))
-fail_unmap_ram_virt:
-	vunmap(epfnv->bar0_ram_map);
-#endif
-fail_unmap_ram_iova:
-	iommu_unmap(domain, epfnv->bar0_iova, PAGE_SIZE);
-fail_free_iova:
-	iommu_dma_free_iova(cdev, epfnv->bar0_iova, BAR0_SIZE);
-fail_free_pages:
-	__free_pages(epfnv->bar0_ram_page, 1);
-fail:
+fail_set_bar:
+	dma_free_coherent(cdev, BAR0_SIZE, epfnv->bar0_ram_map,
+			  epfnv->bar0_iova);
 	return ret;
+#endif
 }
 
 static const struct pci_epf_device_id pci_epf_nv_test_ids[] = {
-- 
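
After applying the patch and rebuilding, the test function is bound through the standard Linux PCIe endpoint configfs interface. A sketch of that sequence is below; the vendor/device IDs and the controller directory name are examples and may differ on your setup (check with ls /sys/kernel/config/pci_ep/controllers/):

cd /sys/kernel/config/pci_ep/
mkdir functions/pci_epf_nv_test/func1
echo 0x10de > functions/pci_epf_nv_test/func1/vendorid   # example IDs only
echo 0x0001 > functions/pci_epf_nv_test/func1/deviceid
# controller directory name is an example; use the one listed under controllers/
ln -s functions/pci_epf_nv_test/func1 controllers/141a0000.pcie_ep/
echo 1 > controllers/141a0000.pcie_ep/start

The EP side is normally configured like this before the RP scans the bus, so that the enlarged BAR0 is what gets enumerated.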

Hi,
We have already used this patch. If we only enlarge the BAR0 size to 512M in the driver, it cannot get enough memory space on the RP side and is not recognized correctly. In fact, as long as the BAR0 size is bigger than 64M (for example, changing BAR0 to 128M), the kernel log reports the error "BAR0: no space for xxxxxx".
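
For reference, we check this on the RP side as follows (the BDF is an example; use the one reported by lspci on your system):

dmesg | grep -i "no space"                      # shows the failed BAR assignment
lspci -vvv -s 0005:01:00.0 | grep "Region 0"    # inspect Region 0 of the endpoint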

Hi,
we have used this patch before and have done some tests.

Currently we can only use 64M; our test result is different from yours.

Hi.
I am wondering how your team can enlarge the BAR0 size to 512M without changing the RP DTB. The DTB on the RP side limits the maximum BAR0 size to 128M on Orin.

I cannot reproduce your test result by only changing the BAR0 size in the driver, and I would appreciate more information to help solve this problem.

Thanks.

To align on the information first: which BSP version are you using there? Official 35.6 without any changes?

Maybe the driver makes the difference, as what we test is the tegra-vnet function:

  1. The change:
diff --git a/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c b/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
index 834081439..07d86a232 100644
--- a/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
+++ b/kernel/nvidia/drivers/pci/endpoint/functions/pci-epf-tegra-vnet.c
@@ -26,7 +26,7 @@
 #include <linux/workqueue.h>
 #include <linux/tegra_vnet.h>
 
-#define BAR0_SIZE SZ_4M
+#define BAR0_SIZE SZ_512M
 
 enum bar0_amap_type {
        META_DATA,
  2. Check in RP:
root@JAXi:/home/nvidia# lspci -vvv -s 0005:01:00.0
0005:01:00.0 Network controller: NVIDIA Corporation Device 2296
...
        Region 0: Memory at 1f40000000 (32-bit, non-prefetchable) [size=512M]
        Region 2: Memory at 1c00000000 (64-bit, prefetchable) [size=128K]
        Region 4: Memory at 1f60000000 (64-bit, non-prefetchable) [size=4K]

It is based on official 35.6, with only small changes.
For the RP, we have made these changes:
1. pinmux change & GPIO configuration for enabling RGMII.
2. EEPROM change.
3. Disable the MGBE configuration in the DTB.
4. Enable the RGMII configuration in the DTB.

For the EP, based on the RP configuration, we reflashed the board as EP (C5) according to the adaptation document:
Add the ODMDATA:
ODMDATA="gbe-uphy-config-0,nvhs-uphy-config-1,hsio-uphy-config-16,hsstp-lane-map-3";
Then flash the device:
sudo ./flash.sh jetson-agx-orin-devkit-industrial internal

Does your team have any further plan to test this with pci-epf-nv-test?

In our case, we will use an x86 host as the RP; the vnet driver may not be suitable for an x86 host.

Sorry, no.

pci-epf-nv-test is no longer supported; rel-36 does not include it either.

I also used r36.3 to test PCIe EP mode.
I reflashed two Orins with r36.3 to test EP and RP mode.
When I run the command 'cat edmalib_test', it works well and I get the expected result.
But there is another problem: I do not know how to transfer data on r36.3, since I cannot use 'busybox devmem' to check the data.
Do you have any suggestions for checking or transferring data on r36.3?

Thanks a lot.