Xavier PCIe endpoint shared memory size

I am using PCIe to communicate between two Xaviers, but I have a problem.

Following the document “NVIDIA Jetson AGX Xavier PCIe Endpoint Software for L4T”, I set up the endpoint Xavier as follows:

cd /sys/kernel/config/pci_ep/
mkdir functions/pci_epf_nv_test/func1
echo 0x10de > functions/pci_epf_nv_test/func1/vendorid
echo 0x0001 > functions/pci_epf_nv_test/func1/deviceid
ln -s functions/pci_epf_nv_test/func1 controllers/141a0000.pcie_ep/
echo 1 > controllers/141a0000.pcie_ep/start

The kernel log is as follows:

dmesg|grep pci_epf_nv_test
[ 38.338101] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM phys: 0x4307b8000
[ 38.338113] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM IOVA: 0xffff0000
[ 38.338138] pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM virt: 0xffffff800b3dc000

I set up the root Xavier as follows:

setpci -s 0005:01:00.0 COMMAND=0x02

The lspci output is as follows:

lspci -v
0005:01:00.0 RAM memory: NVIDIA Corporation Device 0001
Flags: fast devsel, IRQ 255
Memory at 3a300000 (32-bit, non-prefetchable) [disabled] 
Memory at 1c00000000 (64-bit, prefetchable) [disabled] 

But the usable shared memory size is only 4 KB (accesses beyond offset 0xffc return 0xFFFFFFFF). On the root Xavier, I ran the following commands:

nvidia@nvidia:$sudo busybox devmem 0x3a300000 32 0x12345678
nvidia@nvidia:$sudo busybox devmem 0x3a300000
0x12345678

nvidia@nvidia:$sudo busybox devmem 0x3a300ffc 32 0x87654321
nvidia@nvidia:$sudo busybox devmem 0x3a300ffc
0x87654321

nvidia@nvidia:$sudo busybox devmem 0x3a301000 32 0x87654321
nvidia@nvidia:$sudo busybox devmem 0x3a301000
0xFFFFFFFF

How can I increase the shared memory size?

Thanks.

Please apply the following patches (if they are not already present in your codebase) on the Root Port side.
Patch-1:

diff --git a/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi b/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi
index 2333198..cf2176e 100644
--- a/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi
+++ b/kernel-dts/tegra194-soc/tegra194-soc-pcie.dtsi
@@ -572,8 +572,8 @@
  
        bus-range = <0x0 0xff>;
        ranges = <0x81000000 0x0 0x38100000 0x0 0x38100000 0x0 0x00100000      /* downstream I/O (1MB) */
-             0x82000000 0x0 0x38200000 0x0 0x38200000 0x0 0x01E00000      /* non-prefetchable memory (30MB) */
-             0xc2000000 0x18 0x00000000 0x18 0x00000000 0x4 0x00000000>;  /* prefetchable memory (16GB) */
+             0x82000000 0x0 0x40000000 0x1B 0x40000000 0x0 0xC0000000     /* non-prefetchable memory (3GB) */
+             0xc2000000 0x18 0x00000000 0x18 0x00000000 0x3 0x40000000>;  /* prefetchable memory (12GB) */
  
        nvidia,cfg-link-cap-l1sub = <0x1c4>;
        nvidia,cap-pl16g-status = <0x174>;
@@ -640,8 +640,8 @@
  
        bus-range = <0x0 0xff>;
        ranges = <0x81000000 0x0 0x30100000 0x0 0x30100000 0x0 0x00100000      /* downstream I/O (1MB) */
-             0x82000000 0x0 0x30200000 0x0 0x30200000 0x0 0x01E00000      /* non-prefetchable memory (30MB) */
-             0xc2000000 0x12 0x00000000 0x12 0x00000000 0x0 0x40000000>;  /* prefetchable memory (1GB) */
+             0x82000000 0x0 0x40000000 0x12 0x30000000 0x0 0x10000000     /* non-prefetchable memory (256MB) */
+             0xc2000000 0x12 0x00000000 0x12 0x00000000 0x0 0x30000000>;  /* prefetchable memory (768MB) */
  
        nvidia,cfg-link-cap-l1sub = <0x194>;
        nvidia,cap-pl16g-status = <0x164>;
@@ -707,8 +707,8 @@
  
        bus-range = <0x0 0xff>;
        ranges = <0x81000000 0x0 0x32100000 0x0 0x32100000 0x0 0x00100000      /* downstream I/O (1MB) */
-             0x82000000 0x0 0x32200000 0x0 0x32200000 0x0 0x01E00000      /* non-prefetchable memory (30MB) */
-             0xc2000000 0x12 0x40000000 0x12 0x40000000 0x0 0x40000000>;  /* prefetchable memory (1GB) */
+             0x82000000 0x0 0x40000000 0x12 0x70000000 0x0 0x10000000     /* non-prefetchable memory (256MB) */
+             0xc2000000 0x12 0x40000000 0x12 0x40000000 0x0 0x30000000>;  /* prefetchable memory (768MB) */
  
        nvidia,cfg-link-cap-l1sub = <0x194>;
        nvidia,cap-pl16g-status = <0x164>;
@@ -774,8 +774,8 @@
  
        bus-range = <0x0 0xff>;
        ranges = <0x81000000 0x0 0x34100000 0x0 0x34100000 0x0 0x00100000      /* downstream I/O (1MB) */
-             0x82000000 0x0 0x34200000 0x0 0x34200000 0x0 0x01E00000      /* non-prefetchable memory (30MB) */
-             0xc2000000 0x12 0x80000000 0x12 0x80000000 0x0 0x40000000>;  /* prefetchable memory (1GB) */
+             0x82000000 0x0 0x40000000 0x12 0xB0000000 0x0 0x10000000     /* non-prefetchable memory (256MB) */
+             0xc2000000 0x12 0x80000000 0x12 0x80000000 0x0 0x30000000>;  /* prefetchable memory (768MB) */
  
        nvidia,cfg-link-cap-l1sub = <0x194>;
        nvidia,cap-pl16g-status = <0x164>;
@@ -841,8 +841,8 @@
  
        bus-range = <0x0 0xff>;
        ranges = <0x81000000 0x0 0x36100000 0x0 0x36100000 0x0 0x00100000      /* downstream I/O (1MB) */
-             0x82000000 0x0 0x36200000 0x0 0x36200000 0x0 0x01E00000      /* non-prefetchable memory (30MB) */
-             0xc2000000 0x14 0x00000000 0x14 0x00000000 0x4 0x00000000>;  /* prefetchable memory (16GB) */
+             0x82000000 0x0 0x40000000 0x17 0x40000000 0x0 0xC0000000      /* non-prefetchable memory (3GB) */
+             0xc2000000 0x14 0x00000000 0x14 0x00000000 0x3 0x40000000>;  /* prefetchable memory (12GB) */
  
        nvidia,cfg-link-cap-l1sub = <0x1b0>;
        nvidia,cap-pl16g-status = <0x174>;
@@ -913,8 +913,8 @@
  
        bus-range = <0x0 0xff>;
        ranges = <0x81000000 0x0 0x3a100000 0x0 0x3a100000 0x0 0x00100000      /* downstream I/O (1MB) */
-             0x82000000 0x0 0x3a200000 0x0 0x3a200000 0x0 0x01E00000      /* non-prefetchable memory (30MB) */
-             0xc2000000 0x1c 0x00000000 0x1c 0x00000000 0x4 0x00000000>;  /* prefetchable memory (16GB) */
+             0x82000000 0x0 0x40000000 0x1f 0x40000000 0x0 0xC0000000     /* non-prefetchable memory (3GB) */
+             0xc2000000 0x1c 0x00000000 0x1c 0x00000000 0x3 0x40000000>;  /* prefetchable memory (12GB) */
  
        nvidia,cfg-link-cap-l1sub = <0x1c4>;
        nvidia,cap-pl16g-status = <0x174>;

Patch-2:

diff --git a/drivers/pci/dwc/pcie-tegra.c b/drivers/pci/dwc/pcie-tegra.c
index d118cf9..8593dee 100644
--- a/drivers/pci/dwc/pcie-tegra.c
+++ b/drivers/pci/dwc/pcie-tegra.c
@@ -2959,12 +2959,14 @@
            /* program iATU for Non-prefetchable MEM mapping */
            outbound_atu(pp, PCIE_ATU_REGION_INDEX3,
                     PCIE_ATU_TYPE_MEM, win->res->start,
-                    win->res->start, resource_size(win->res));
+                    win->res->start - win->offset,
+                    resource_size(win->res));
        } else if (win->res->flags & IORESOURCE_MEM) {
            /* program iATU for Non-prefetchable MEM mapping */
            outbound_atu(pp, PCIE_ATU_REGION_INDEX2,
                     PCIE_ATU_TYPE_MEM, win->res->start,
-                    win->res->start, resource_size(win->res));
+                    win->res->start - win->offset,
+                    resource_size(win->res));
        }
    }

Apply the following patch on the endpoint side.
Patch-1:

diff --git a/drivers/pci/endpoint/functions/pci-epf-nv-test.c b/drivers/pci/endpoint/functions/pci-epf-nv-test.c
index 8b2a1dcecab8..d3496a199842 100644
--- a/drivers/pci/endpoint/functions/pci-epf-nv-test.c
+++ b/drivers/pci/endpoint/functions/pci-epf-nv-test.c
@@ -16,7 +16,7 @@
 #include <linux/pci-epc.h>
 #include <linux/pci-epf.h>
 
-#define BAR0_SIZE SZ_64K
+#define BAR0_SIZE SZ_512M
 
 struct pci_epf_nv_test {
 	struct pci_epf_header header;
@@ -30,14 +30,11 @@ static void pci_epf_nv_test_unbind(struct pci_epf *epf)
 	struct pci_epf_nv_test *epfnv = epf_get_drvdata(epf);
 	struct pci_epc *epc = epf->epc;
 	struct device *cdev = epc->dev.parent;
-	struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
 
 	pci_epc_stop(epc);
 	pci_epc_clear_bar(epc, BAR_0);
-	vunmap(epfnv->bar0_ram_map);
-	iommu_unmap(domain, epfnv->bar0_iova, PAGE_SIZE);
-	iommu_dma_free_iova(cdev, epfnv->bar0_iova, BAR0_SIZE);
-	__free_pages(epfnv->bar0_ram_page, 1);
+	dma_free_coherent(cdev, BAR0_SIZE, epfnv->bar0_ram_map,
+			  epfnv->bar0_iova);
 }
 
 static int pci_epf_nv_test_bind(struct pci_epf *epf)
@@ -47,7 +44,6 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
 	struct pci_epf_header *header = epf->header;
 	struct device *fdev = &epf->dev;
 	struct device *cdev = epc->dev.parent;
-	struct iommu_domain *domain = iommu_get_domain_for_dev(cdev);
 	int ret;
 
 	ret = pci_epc_write_header(epc, header);
@@ -56,60 +52,29 @@ static int pci_epf_nv_test_bind(struct pci_epf *epf)
 		return ret;
 	}
 
-	epfnv->bar0_ram_page = alloc_pages(GFP_KERNEL, 1);
-	if (!epfnv->bar0_ram_page) {
-		dev_err(fdev, "alloc_pages() failed\n");
-		ret = -ENOMEM;
-		goto fail;
-	}
-	dev_info(fdev, "BAR0 RAM phys: 0x%llx\n",
-		 page_to_phys(epfnv->bar0_ram_page));
-
-	epfnv->bar0_iova = iommu_dma_alloc_iova(cdev, BAR0_SIZE,
-						cdev->coherent_dma_mask);
-	if (!epfnv->bar0_iova) {
-		dev_err(fdev, "iommu_dma_alloc_iova() failed\n");
-		ret = -ENOMEM;
-		goto fail_free_pages;
-	}
-
-	dev_info(fdev, "BAR0 RAM IOVA: 0x%08llx\n", epfnv->bar0_iova);
-
-	ret = iommu_map(domain, epfnv->bar0_iova,
-			page_to_phys(epfnv->bar0_ram_page),
-			PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);
-	if (ret) {
-		dev_err(fdev, "iommu_map(RAM) failed: %d\n", ret);
-		goto fail_free_iova;
-	}
-	epfnv->bar0_ram_map = vmap(&epfnv->bar0_ram_page, 1, VM_MAP,
-				   PAGE_KERNEL);
+	epfnv->bar0_ram_map = dma_alloc_coherent(cdev, BAR0_SIZE,
+						 &epfnv->bar0_iova, GFP_KERNEL);
 	if (!epfnv->bar0_ram_map) {
-		dev_err(fdev, "vmap() failed\n");
+		dev_err(fdev, "dma_alloc_coherent() failed\n");
 		ret = -ENOMEM;
-		goto fail_unmap_ram_iova;
+		return ret;
 	}
-	dev_info(fdev, "BAR0 RAM virt: 0x%p\n", epfnv->bar0_ram_map);
+	dev_info(fdev, "BAR0 RAM IOVA: 0x%llx\n", epfnv->bar0_iova);
 
 	ret = pci_epc_set_bar(epc, BAR_0, epfnv->bar0_iova, BAR0_SIZE,
 			      PCI_BASE_ADDRESS_SPACE_MEMORY |
-			      PCI_BASE_ADDRESS_MEM_TYPE_32);
+			      PCI_BASE_ADDRESS_MEM_TYPE_32 |
+			      PCI_BASE_ADDRESS_MEM_PREFETCH);
 	if (ret) {
 		dev_err(fdev, "pci_epc_set_bar() failed: %d\n", ret);
-		goto fail_unmap_ram_virt;
+		goto fail_set_bar;
 	}
 
 	return 0;
 
-fail_unmap_ram_virt:
-	vunmap(epfnv->bar0_ram_map);
-fail_unmap_ram_iova:
-	iommu_unmap(domain, epfnv->bar0_iova, PAGE_SIZE);
-fail_free_iova:
-	iommu_dma_free_iova(cdev, epfnv->bar0_iova, BAR0_SIZE);
-fail_free_pages:
-	__free_pages(epfnv->bar0_ram_page, 1);
-fail:
+fail_set_bar:
+	dma_free_coherent(cdev, BAR0_SIZE, epfnv->bar0_ram_map,
+			  epfnv->bar0_iova);
 	return ret;
 }

With these, you should be able to increase the BAR size to 512 MB.
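
As a quick sanity check after applying the patches and rebooting both systems, the same ‘busybox devmem’ test from the first post can be repeated on the root-port side. The base address is whatever lspci then reports for Region 0 of the endpoint; the 0x1ffffffc offset below simply targets the last 32-bit word of a 512 MB BAR and is only illustrative:

sudo busybox devmem <region-0 base> 32 0x12345678
sudo busybox devmem <region-0 base>
sudo busybox devmem <region-0 base + 0x1ffffffc> 32 0x87654321
sudo busybox devmem <region-0 base + 0x1ffffffc>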

Hi, vidyas:

I changed the code as you suggested on the Root Port side, but it did not help: the usable shared memory size on the root Xavier is still only 4 KB.

The log is as follows:

lspci -vv
0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 38
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	Memory behind bridge: 40000000-401fffff
	Prefetchable memory behind bridge: 0000001c00000000-0000001c000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: <access denied>
	Kernel driver in use: pcieport

0005:01:00.0 RAM memory: NVIDIA Corporation Device 1ad5
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 595
	Region 0: Memory at 1f40100000 (32-bit, non-prefetchable) 
	Region 2: Memory at 1c00000000 (64-bit, prefetchable) 
	Region 4: Memory at 1f40000000 (64-bit, non-prefetchable) 
	Capabilities: <access denied>
	Kernel driver in use: tegra_ep_mem
nvidia@nvidia-desktop:~$ sudo busybox devmem 0x1f40100000 32 0x01
nvidia@nvidia-desktop:~$ sudo busybox devmem 0x1f40100000 
0x00000001
nvidia@nvidia-desktop:~$ 
nvidia@nvidia-desktop:~$ sudo busybox devmem 0x1f40100ffc 32 0x02
nvidia@nvidia-desktop:~$ sudo busybox devmem 0x1f40100ffc 
0x00000002
nvidia@nvidia-desktop:~$ 
nvidia@nvidia-desktop:~$ sudo busybox devmem 0x1f40101000 32 0x03
nvidia@nvidia-desktop:~$ sudo busybox devmem 0x1f40101000 
0xFFFFFFFF

The shared memory is still only the 4 KB starting at “Region 0: Memory at 1f40100000”. I need to read/write a much larger region.

Can you help me?

Thanks.

What about the patches I asked you to apply on the endpoint system? Didn’t you apply them?

Thank you for your reply.

You said: “Please apply the following patches (if they are not already present in your codebase) on the Root Port side.”

I applied the patches on the root system, not on the endpoint system.
What should I do? Patch the root system, the endpoint system, or both?

Thanks.

Please check my comment #2 once again. I have one patch to be applied on the endpoint as well. Search for “Apply the following patch on the endpoint side”.

Sorry.

This is my mistake. I didn’t see it.

I will try it now.

Thank you very much! ^_^

Hi, vidyas:
Thank you for your reply.

I patched the code as you suggested.
The endpoint system log is as follows:

pci_epf_nv_test pci_epf_nv_test.0: BAR0 RAM IOVA: 0xe0000000

The root system log is as follows:

0005:01:00.0 RAM memory: NVIDIA Corporation Device 1ad5
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 595
	Region 0: Memory at 1f40000000 (32-bit, non-prefetchable) 
	Region 2: Memory at 1c00000000 (64-bit, prefetchable) 
	Region 4: Memory at 1f60000000 (64-bit, non-prefetchable) 
	Capabilities: <access denied>
	Kernel driver in use: tegra_ep_mem

But what is the shared memory address?

I entered these commands on the endpoint system:

sudo busybox devmem 0xe0000000 32 0x12345678
sudo busybox devmem 0xe0000000
the result is 0x12345678

Then I entered these commands on the root system:

sudo busybox devmem 0x1f40000000 
the result is 0x00000000

sudo busybox devmem 0x1c00000000 
the result is 0x00000000

sudo busybox devmem 0x1f60000000 
the result is 0x00000688

Can you help me?

The “BAR0 RAM IOVA: 0xe0000000” print seen on the endpoint system refers, as the print indicates, to the IOVA of the shared memory region, i.e. the address used by the PCIe IP in the endpoint system to access that memory. Please note that this is not equivalent to the physical address, since the SMMU is enabled for PCIe in the endpoint system. Also note that the input to ‘busybox devmem’ must be a physical address in the system, not an IOVA (or a CPU-VA, for that matter). So, when you accessed 0xe0000000 through ‘busybox devmem’, what you were actually accessing was whatever real physical memory sits at that address. Since memory does exist at that location, you were able to write to it and read back the same data, but it is not the shared memory.

To access the shared memory through ‘busybox devmem’ you would need the physical address behind the 0xe0000000 IOVA, but since we used dma_alloc_coherent() to allocate the memory, we only get the IOVA and the CPU-VA, not the physical address (in fact, the physical memory represented by this IOVA may be fragmented). Ideally one never needs the physical address: drivers access that memory through the CPU-VA, and the PCIe IP accesses it through the IOVA. ‘busybox devmem’ is not a real use case as such.
Coming to the root port side:
0x1f40000000 - the shared memory BAR (region-0)
0x1c00000000 - the MSI-X table (region-2)
0x1f60000000 - the DMA engine registers of the EP (region-4)
All of these are physical addresses in the host system, so you can access the shared memory with ‘busybox devmem’ on region-0, or it can be ioremap()'ed and used in a driver.
I hope this clarifies things.
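
To make the devmem/ioremap() option above concrete, here is a minimal user-space sketch for the root-port side that maps Region 0 through /dev/mem, which is essentially what ‘busybox devmem’ does one word at a time. The base address and size are taken from this thread (Region 0 at 0x1f40000000, 512 MB BAR) and are assumptions for illustration only; in a kernel driver you would ioremap() the same physical range instead.

/* barmap.c - illustrative only; build with gcc and run as root on the RP system */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define BAR0_PHYS 0x1f40000000UL	/* Region 0 address reported by lspci (example) */
#define BAR0_SIZE (512UL << 20)		/* 512 MB BAR set up by the endpoint patch */

int main(void)
{
	int fd = open("/dev/mem", O_RDWR | O_SYNC);
	if (fd < 0) {
		perror("open(/dev/mem)");
		return 1;
	}

	/* Map the whole BAR; bar[i] then aliases the endpoint's shared memory. */
	volatile uint32_t *bar = mmap(NULL, BAR0_SIZE, PROT_READ | PROT_WRITE,
				      MAP_SHARED, fd, BAR0_PHYS);
	if (bar == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return 1;
	}

	bar[0] = 0x12345678;			/* first word of the shared memory */
	printf("readback: 0x%08x\n", bar[0]);

	munmap((void *)bar, BAR0_SIZE);
	close(fd);
	return 0;
}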

Hi, vidyas:
Thank you for your reply.

I know why I was wrong. On the endpoint system side, 0xe0000000 is the shared memory physical address, but on the root system side, 0x1f40000000 is the shared memory physical address.

On the root system side, I can read/write the shared memory through mmap() in user space.

Can you tell me what I should do to read/write the shared memory from user space on the endpoint system side?

Thank you very much.

(Posted in error.)

Can you tell me what I should do to read/write the shared memory from user space on the endpoint system side?
There is no way to do this at this point in time, as it is not required for any real use case. What is your use case?
One way to achieve this is to use a different set of APIs, i.e. the original code in the pci-epf-nv-test.c file, where physical memory is allocated first and then mapped to both a CPU-VA and an IOVA. The issue there is that we can’t allocate large sizes of memory.
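
For reference, that is the flow removed by the endpoint patch above; a condensed sketch (reassembled from those removed lines, error handling omitted) looks like the following. Because the physical pages are allocated first, their physical address is known and could be exposed to user space, but only a couple of pages are allocated and only PAGE_SIZE is IOMMU-mapped, which is exactly the large-allocation limitation mentioned above.

/* Condensed from the original pci_epf_nv_test_bind() shown in the diff above. */
epfnv->bar0_ram_page = alloc_pages(GFP_KERNEL, 1);		/* contiguous physical pages first */
dev_info(fdev, "BAR0 RAM phys: 0x%llx\n",
	 page_to_phys(epfnv->bar0_ram_page));			/* the physical address is known here */

epfnv->bar0_iova = iommu_dma_alloc_iova(cdev, BAR0_SIZE,
					cdev->coherent_dma_mask);	/* IOVA used by the PCIe IP */

iommu_map(domain, epfnv->bar0_iova, page_to_phys(epfnv->bar0_ram_page),
	  PAGE_SIZE, IOMMU_READ | IOMMU_WRITE);			/* IOVA -> phys, one page only */

epfnv->bar0_ram_map = vmap(&epfnv->bar0_ram_page, 1,
			   VM_MAP, PAGE_KERNEL);		/* CPU-VA for kernel-side access */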

Hi, vidyas:
Thank you for your reply.

I need to transfer a large amount of data. What should I do?

What is your use case? You can use the virtual Ethernet over PCIe connection to do the data transfer. I don’t understand the reason for using “busybox devmem” here.

Hi, vidyas:
Thank you for your reply.

I use the virtual Ethernet over PCIe connection to transfer data, but the speed is only 100 Mb/s.
I tested with iperf3.
The speed is very slow, and the CPU usage is very high.

Thank you very much.

Did you follow the correct boot sequence?
i.e.
→ Boot EP system
→ Enter the above commands
→ Boot RP system

Yes, the boot sequence is correct.

  1. Boot EP system
  2. Enter the following commands:
cd /sys/kernel/config/pci_ep/
mkdir functions/pci_epf_tvnet/func1
ln -s functions/pci_epf_tvnet/func1 controllers/141a0000.pcie_ep/
echo 1 > controllers/141a0000.pcie_ep/start
  3. Boot RP system

Can you please attach the ‘sudo lspci -vvvv’ output of the RP system?

nvidia@nvidia-desktop:~$ sudo lspci -vvvv
[sudo] password for nvidia: 
0001:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad2 (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 34
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 00000000-00000fff
	Memory behind bridge: 30200000-302fffff
	Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <64us
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootCap: CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
		Vector table: BAR=0 offset=00000000
		PBA: BAR=0 offset=00000000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [148 v1] #19
	Capabilities: [158 v1] #26
	Capabilities: [17c v1] #27
	Capabilities: [190 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=60us
		L1SubCtl2: T_PwrOn=40us
	Capabilities: [1a0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2a0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [2d8 v1] #25
	Capabilities: [2e4 v1] Precision Time Measurement
		PTMCap: Requester:- Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [2f0 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Kernel driver in use: pcieport

0001:01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9171 (rev 13) (prog-if 01 [AHCI 1.0])
	Subsystem: Marvell Technology Group Ltd. Device 9171
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 563
	Region 0: I/O ports at 100010 
	Region 1: I/O ports at 100020 
	Region 2: I/O ports at 100018 
	Region 3: I/O ports at 100024 
	Region 4: I/O ports at 100000 
	Region 5: Memory at 30210000 (32-bit, non-prefetchable) 
	Expansion ROM at 30200000 [disabled] 
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: e4400000  Data: 0000
	Capabilities: [70] Express (v2) Legacy Endpoint, MSI 00
		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <1us, L1 <64us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <64us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Kernel driver in use: ahci

0005:00:00.0 PCI bridge: NVIDIA Corporation Device 1ad0 (rev a1) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 38
	Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
	I/O behind bridge: 0000f000-00000fff
	Memory behind bridge: 3a200000-3a7fffff
	Prefetchable memory behind bridge: 0000001c00000000-0000001c000fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
		Address: 0000000000000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 16GT/s, Width x8, ASPM not supported, Exit Latency L0s <1us, L1 <64us
			ClockPM- Surprise+ LLActRep+ BwNot+ ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt+ AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
		RootCap: CRSVisible+
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [148 v1] #19
	Capabilities: [168 v1] #26
	Capabilities: [190 v1] #27
	Capabilities: [1c0 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=60us
		L1SubCtl2: T_PwrOn=40us
	Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [308 v1] #25
	Capabilities: [314 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [320 v1] Vendor Specific Information: ID=0004 Rev=1 Len=054 <?>
	Kernel driver in use: pcieport

0005:01:00.0 Network controller: NVIDIA Corporation Device 2296
	Subsystem: NVIDIA Corporation Device 0000
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 38
	Region 0: Memory at 3a400000 (32-bit, non-prefetchable) 
	Region 2: Memory at 1c00000000 (64-bit, prefetchable) 
	Region 4: Memory at 3a200000 (64-bit, non-prefetchable) 
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit-
		Address: 00000000  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [70] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 256 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x8, ASPM not supported, Exit Latency L0s <1us, L1 <64us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
		LnkCtl2: Target Link Speed: 16GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b0] MSI-X: Enable+ Count=8 Masked-
		Vector table: BAR=2 offset=00000000
		PBA: BAR=2 offset=00010000
	Capabilities: [100 v2] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [148 v1] #19
	Capabilities: [168 v1] #26
	Capabilities: [190 v1] #27
	Capabilities: [1b8 v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [1c0 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1- L1_PM_Substates+
			  PortCommonModeRestoreTime=60us PortTPowerOnTime=40us
		L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
			   T_CommonMode=0us
		L1SubCtl2: T_PwrOn=40us
	Capabilities: [1d0 v1] Vendor Specific Information: ID=0002 Rev=4 Len=100 <?>
	Capabilities: [2d0 v1] Vendor Specific Information: ID=0001 Rev=1 Len=038 <?>
	Capabilities: [308 v1] #25
	Capabilities: [314 v1] Precision Time Measurement
		PTMCap: Requester:+ Responder:+ Root:+
		PTMClockGranularity: 16ns
		PTMControl: Enabled:- RootSelected:-
		PTMEffectiveGranularity: Unknown
	Capabilities: [320 v1] Vendor Specific Information: ID=0003 Rev=1 Len=054 <?>
	Kernel driver in use: tvnet

nvidia@nvidia-desktop:~$

Hi, vidyas:

Can you help me?

I use the virtual Ethernet over PCIe connection to transfer data, but the speed is only 120 Mb/s.

Thank you very very very much.