How do I increase the size of BAR0 in Endpoint Mode on Orin NX?

Hi Nvidia,

I’m following this topic to modify the pci-epf-nv-test.c driver:
https://forums.developer.nvidia.com/t/increasing-size-of-bar0-in-endpoint-mode/232498

But I have the same question as this topic:
https://forums.developer.nvidia.com/t/in-pcie-endpoint-and-rootpoint-modes-after-increasing-the-bar0-size-how-to-test-the-data-communication/261514

I can’t see the BAR0 RAM physical address either.

wilson@OrinNX-EP:~$ sudo dmesg | grep pci_epf_nv_test
[  522.603944] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM IOVA: 0xe0000000

How can I check the BAR0 size and test the data communication?

There should be a code snippet like this in the driver:

#define BAR0_SIZE SZ_64K
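
To check the BAR0 size and do a quick data test from the RP side, one option is a small userspace program that mmaps the BAR through sysfs. This is only a sketch, not an official sample; the BDF 0005:01:00.0 below is a placeholder, replace it with the one reported by lspci on your RP:

/* bar0_test.c - sketch: check BAR0 size and read/write it from the RP side.
 * The BDF "0005:01:00.0" is a placeholder. Build with: gcc -o bar0_test bar0_test.c
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	const char *res = "/sys/bus/pci/devices/0005:01:00.0/resource0";
	struct stat st;
	int fd = open(res, O_RDWR | O_SYNC);

	if (fd < 0 || fstat(fd, &st) < 0) {
		perror("open/fstat resource0");
		return 1;
	}
	/* the size of the resource0 file equals the BAR0 size */
	printf("BAR0 size: 0x%llx\n", (unsigned long long)st.st_size);

	volatile uint32_t *bar = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
				      MAP_SHARED, fd, 0);
	if (bar == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	bar[0] = 0xdeadbeef;                  /* first word of BAR0 */
	bar[st.st_size / 4 - 1] = 0xcafef00d; /* last word of BAR0  */
	printf("start: 0x%x  end: 0x%x\n", bar[0], bar[st.st_size / 4 - 1]);

	munmap((void *)bar, st.st_size);
	close(fd);
	return 0;
}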

Hi Wayne,

Yes, I have modified it to SZ_512M.
But I can’t see the BAR0 RAM physical address after I modified it.

Is this still an issue that needs support? Is there any result that can be shared?

Hi Kayccc,

Thanks for the reply.

I’m following this topic to add the patch, and I use iommu_iova_to_phys() to get the physical address.
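
The extra print I added is roughly like this (just a sketch; it goes right after the dma_alloc_coherent() call in pci_epf_nv_test_bind() and needs <linux/iommu.h>):

/* sketch: translate the BAR0 IOVA back to a physical address for debugging */
struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
phys_addr_t phys = iommu_iova_to_phys(domain, epfnv->bar0_iova);

dev_info(fdev, "BAR0 RAM phys: 0x%llx\n", (u64)phys);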

But in my test, the size of BAR0 is still very small, not 512MB.

I wrote 0xFF to it, and it looks like only 4KB is accessible.

$ sudo dmesg | grep pci_epf_nv_test
[   89.794238] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM IOVA: 0xe0000000
[   89.794245] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM phys: 0x10a2c5000

$ sudo busybox devmem 0x10a2c5000
0xFFFFFFFF
$ sudo busybox devmem 0x10a2c5FFF
0x000000FF

Hi Kayccc,

I got a CBB error when I ran devmem on the RP side.

[  103.388110] WARNING: CPU: 0 PID: 0 at drivers/soc/tegra/cbb/tegra234-cbb.c:577 tegra234_cbb_isr+0x130/0x170
[  103.398120] Modules linked in: bnep(E) nvidia_modeset(OE) fuse(E) lzo_rle(E) lzo_compress(E) zram(E) ramoops(E) reed_solomon(E) hid_logitech_hidpp(E) input_leds(E) binfmt_misc(E) snd_soc_tegra186_asrc(E) snd_soc_tegra186_arad(E) snd_soc_tegra210_iqc(E) snd_soc_tegra210_ope(E) snd_soc_tegra186_dspk(E) snd_soc_tegra210_mvc(E) snd_soc_tegra210_afc(E) snd_soc_tegra210_dmic(E) snd_soc_tegra210_adx(E) snd_soc_tegra210_mixer(E) snd_soc_tegra210_amx(E) snd_soc_tegra210_i2s(E) snd_soc_tegra210_admaif(E) snd_soc_tegra210_sfc(E) snd_soc_tegra_pcm(E) aes_ce_blk(E) crypto_simd(E) cryptd(E) aes_ce_cipher(E) ghash_ce(E) sha2_ce(E) hid_logitech_dj(E) sha256_arm64(E) sha1_ce(E) snd_soc_tegra_machine_driver(E) snd_soc_spdif_tx(E) snd_soc_tegra210_adsp(E) userspace_alert(E) snd_soc_tegra_utils(E) snd_soc_tegra210_ahub(E) snd_soc_simple_card_utils(E) tegra_bpmp_thermal(E) tegra210_adma(E) nvadsp(E) spi_tegra114(E) r8168(E) nvidia(OE) loop(E) ina3221(E) pwm_fan(E) nvgpu(E) nvmap(E) ip_tables(E) x_tables(E)
[  103.398180]  [last unloaded: mtd]
[  103.398185] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W  OE     5.10.120-tegra #1
[  103.398186] Hardware name: Unknown NVIDIA Orin NX Developer Kit/NVIDIA Orin NX Developer Kit, BIOS 4.1-33958178 08/01/2023
[  103.398188] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--)
[  103.398190] pc : tegra234_cbb_isr+0x130/0x170
[  103.398192] lr : tegra234_cbb_isr+0x10c/0x170
[  103.398193] sp : ffff800010003e10
[  103.398194] x29: ffff800010003e10 x28: ffffa8e91a1326c0
[  103.398196] x27: 0000000000000001 x26: 0000000000000080
[  103.398199] x25: ffffa8e919b37f48 x24: ffffa8e91a49ee40
[  103.398201] x23: ffffa8e919e27008 x22: 0000000000000019
[  103.398203] x21: ffffa8e91a2bf548 x20: 0000000000000002
[  103.398205] x19: ffffa8e91a2bf538 x18: 0000000000000010
[  103.398207] x17: 0000000000000000 x16: ffffa8e918404db0
[  103.398209] x15: ffffa8e91a132c30 x14: ffffffffffffffff
[  103.398211] x13: ffff800090003917 x12: ffff80001000391f
[  103.398213] x11: 0101010101010101 x10: 7f7f7f7f7f7f7f7f
[  103.398215] x9 : ffff800010003c30 x8 : 2a2a2a2a2a2a2a2a
[  103.398218] x7 : 2a2a2a2a2a2a2a09 x6 : c0000000ffffefff
[  103.398220] x5 : ffff42d42e7be958 x4 : ffffa8e91a147a88
[  103.398222] x3 : 0000000000000001 x2 : ffffa8e91859a830
[  103.398224] x1 : ffffa8e91a1326c0 x0 : 0000000100010001
[  103.398226] Call trace:
[  103.398229]  tegra234_cbb_isr+0x130/0x170
[  103.398233]  __handle_irq_event_percpu+0x60/0x2a0
[  103.398235]  handle_irq_event_percpu+0x3c/0x90
[  103.398237]  handle_irq_event+0x4c/0xf0
[  103.398239]  handle_fasteoi_irq+0xbc/0x170
[  103.398241]  generic_handle_irq+0x3c/0x60
[  103.398242]  __handle_domain_irq+0x6c/0xc0
[  103.398245]  gic_handle_irq+0x64/0x130
[  103.398246]  el1_irq+0xd0/0x180
[  103.398251]  cpuidle_enter_state+0xb4/0x400
[  103.398252]  cpuidle_enter+0x3c/0x50
[  103.398254]  call_cpuidle+0x40/0x70
[  103.398256]  do_idle+0x1fc/0x260
[  103.398257]  cpu_startup_entry+0x28/0x70
[  103.398261]  rest_init+0xd8/0xe4
[  103.398266]  arch_call_rest_init+0x14/0x1c
[  103.398267]  start_kernel+0x4f8/0x52c
[  103.398269] ---[ end trace 70c4fb172cf5f3b8 ]---

I’m not very sure about the ranges of pcie_c4_rp in tegra234-soc-pcie.dtsi.
I’m using Orin NX 16GB module.
Could you help us?

ranges = <0x81000000 0x00 0x36100000 0x00 0x36100000 0x0 0x00100000     /* downstream I/O (1MB) */
          0x82000000 0x00 0x40000000 0x24 0x28000000 0x0 0x20000000     /* non-prefetchable memory (512MB) */
          0xc3000000 0x21 0x40000000 0x21 0x40000000 0x2 0xe8000000>;    /* prefetchable memory (11904MB) */

Please apply this patch.

Nothing else needs to be changed. The ranges cannot be adjusted either; they are fixed values on Orin.


diff --git a/drivers/pci/endpoint/functions/pci-epf-nv-test.c b/drivers/pci/endpoint/functions/pci-epf-nv-test.c
index 8b2a1dcecab8..d3496a199842 100644
--- a/drivers/pci/endpoint/functions/pci-epf-nv-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-nv-test.c
@@ -16,7 +16,7 @@
 #include <linux/pci-epc.h>
 #include <linux/pci-epf.h>
  
-#define BAR0_SIZE SZ_64K
+#define BAR0_SIZE SZ_512M
  
 struct pci_epf_nv_test {
    struct pci_epf_header header;
@@ -30,14 +30,11 @@ static void pci_epf_nv_test_unbind(struct pci_epf *epf)
    struct pci_epf_nv_test *epfnv = epf_get_drvdata(epf);
    struct pci_epc *epc = epf->epc;
    struct device *cdev = epc->dev.parent;
-   struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
  
    pci_epc_stop(epc);
    pci_epc_clear_bar(epc, BAR_0);
-   vunmap(epfnv->bar0_ram_map);
-   iommu_unmap(domain, epfnv->bar0_iova, PAGE_SIZE);
-   iommu_dma_free_iova(cdev, epfnv->bar0_iova, BAR0_SIZE);
-   __free_pages(epfnv->bar0_ram_page, 1);
+   dma_free_coherent(cdev, BAR0_SIZE, epfnv->bar0_ram_map,
+             epfnv->bar0_iova);
 }
  
 static int pci_epf_nv_test_bind(struct pci_epf *epf)
@@ -47,7 +44,6 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
    struct pci_epf_header *header = epf->header;
    struct device *fdev = &epf->dev;
    struct device *cdev = epc->dev.parent;
-   struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
    int ret;
  
    ret = pci_epc_write_header(epc, header);
@@ -56,60 +52,29 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
        return ret;
    }
  
-   epfnv->bar0_ram_page = alloc_pages(GFP_KERNEL, 1);
-   if (!epfnv->bar0_ram_page) {
-       dev_err(fdev, "alloc_pages() failed\n");
-       ret = -ENOMEM;
-       goto fail;
-   }
-   dev_info(fdev, "BAR0 RAM phys: 0x%llx\n",
-        page_to_phys(epfnv->bar0_ram_page));
-
-   epfnv->bar0_iova = iommu_dma_alloc_iova(cdev, BAR0_SIZE,
-                       cdev->coherent_dma_mask);
-   if (!epfnv->bar0_iova) {
-       dev_err(fdev, "iommu_dma_alloc_iova() failed\n");
-       ret = -ENOMEM;
-       goto fail_free_pages;
-   }
-
-   dev_info(fdev, "BAR0 RAM IOVA: 0x%08llx\n", epfnv->bar0_iova);
-
-   ret = iommu_map(domain, epfnv->bar0_iova,
-           page_to_phys(epfnv->bar0_ram_page),
-           PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);
-   if (ret) {
-       dev_err(fdev, "iommu_map(RAM) failed: %d\n", ret);
-       goto fail_free_iova;
-   }
-   epfnv->bar0_ram_map = vmap(&epfnv->bar0_ram_page, 1, VM_MAP,
-                  PAGE_KERNEL);
+   epfnv->bar0_ram_map = dma_alloc_coherent(cdev, BAR0_SIZE,
+                        &epfnv->bar0_iova, GFP_KERNEL);
    if (!epfnv->bar0_ram_map) {
-       dev_err(fdev, "vmap() failed\n");
+       dev_err(fdev, "dma_alloc_coherent() failed\n");
        ret = -ENOMEM;
-       goto fail_unmap_ram_iova;
+       return ret;
    }
-   dev_info(fdev, "BAR0 RAM virt: 0x%p\n", epfnv->bar0_ram_map);
+   dev_info(fdev, "BAR0 RAM IOVA: 0x%llx\n", epfnv->bar0_iova);
  
    ret = pci_epc_set_bar(epc, BAR_0, epfnv->bar0_iova, BAR0_SIZE,
                  PCI_BASE_ADDRESS_SPACE_MEMORY |
-                 PCI_BASE_ADDRESS_MEM_TYPE_32);
+                 PCI_BASE_ADDRESS_MEM_TYPE_32 |
+                 PCI_BASE_ADDRESS_MEM_PREFETCH);
    if (ret) {
        dev_err(fdev, "pci_epc_set_bar() failed: %d\n", ret);
-       goto fail_unmap_ram_virt;
+       goto fail_set_bar;
    }
  
    return 0;
  
-fail_unmap_ram_virt:
-   vunmap(epfnv->bar0_ram_map);
-fail_unmap_ram_iova:
-   iommu_unmap(domain, epfnv->bar0_iova, PAGE_SIZE);
-fail_free_iova:
-   iommu_dma_free_iova(cdev, epfnv->bar0_iova, BAR0_SIZE);
-fail_free_pages:
-   __free_pages(epfnv->bar0_ram_page, 1);
-fail:
+fail_set_bar:
+   dma_free_coherent(cdev, BAR0_SIZE, epfnv->bar0_ram_map,
+             epfnv->bar0_iova);
    return ret;
 }
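
As a usage sketch (not part of the patch itself): after dma_alloc_coherent() succeeds, epfnv->bar0_ram_map is the EP-side kernel virtual address of the whole buffer, so the driver can read and write it directly, for example:

/* sketch: fill the buffer with a recognizable pattern in pci_epf_nv_test_bind(),
 * right after dma_alloc_coherent() succeeds
 */
memset(epfnv->bar0_ram_map, 0, BAR0_SIZE);
*(u32 *)epfnv->bar0_ram_map = 0x12345678;                   /* first word */
*(u32 *)(epfnv->bar0_ram_map + BAR0_SIZE - 4) = 0x87654321; /* last word  */

The RP side can then read those two offsets back (for example with busybox devmem at the BAR0 base address and at base + BAR0_SIZE - 4) to confirm the whole 512MB window is backed.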

Hi Wayne,

Thank you very much for the reply.

I had already tried this patch before. But how can I read/write the IOVA address?

Hi Wayne,

Sorry for replying in Chinese.

With this patch, I can now see the 512MB size on the RP side, and I can also read and write it with busybox devmem.
However, on the EP side I can currently only get the IOVA address,
and I have no idea how to read or write this memory region.

I have read some of the topics linked above, but I am still not clear on how to implement this memory access.
Since I have been stuck on this problem for a long time, I would really appreciate some pointers. Thank you.

Hi Wilson,

Since the BAR0 size can be adjusted now, you may want to open a new topic to discuss your IOVA question.
It doesn't sound like it is directly related to the BAR0 size, is it?

Hi Wayne,

Thank you very much. I have opened a new topic.

Hi,

I want to confirm one last thing.
If you only change BAR0_SIZE (SZ_64K), do you see the BAR0 size change in lspci on the RP side?

Hi Wayne,

Do you mean keeping the original code, the one that prints the phys address?
If I only change BAR0_SIZE, lspci on the RP side shows the change, but in actual testing I cannot write beyond 4K.

Is the "cannot write beyond 4K" problem the same as the one in your new topic,
or is it a different problem?

Let me explain in detail the cases I tested:

  1. With the original sample code, it looks like only 4K (one page) can be read and written.
  2. Continuing from 1: with the original sample code, if only BAR0_SIZE is modified, lspci on the RP side shows the change, but actual read/write is still limited to 4K. Reading beyond 4K with devmem on the RP side triggers a bus error.

In cases 1 and 2 above, a phys address is printed, so busybox devmem can be used to read and write.

The following is after applying the patch:
3. After applying the patch, only the IOVA is printed, with no phys address, so devmem cannot be used directly.
However, on the RP side I can read beyond 4K without a bus error, so I believe the size was indeed increased. But since I cannot read or write the IOVA on the EP side, I still cannot really prove that it got bigger (one direction I am considering is sketched below).

The new topic I opened is a continuation of issue 3.
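
One direction I am considering for reading the EP-side buffer from userspace is to expose it read-only through debugfs, roughly like the sketch below (my own idea, not from the sample driver; the file name is made up):

#include <linux/debugfs.h>   /* in addition to the driver's existing includes */

static struct debugfs_blob_wrapper bar0_blob;

/* in pci_epf_nv_test_bind(), after dma_alloc_coherent() succeeds: */
bar0_blob.data = epfnv->bar0_ram_map;
bar0_blob.size = BAR0_SIZE;
debugfs_create_blob("pci_epf_nv_test_bar0", 0444, NULL, &bar0_blob);

/* then, on the EP, dump what the RP wrote through BAR0:
 *   sudo hexdump -C /sys/kernel/debug/pci_epf_nv_test_bar0 | less
 */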
