pci_alloc_consistent ignores mask set by pci_set_consistent_dma_mask

Hi,

We have a PCIe device that does DMA transfers from device memory to system memory. Our device is 64-bit DMA capable, but fetches DMA descriptors from 32-bit bus addresses (i.e., it requires DMA descriptors to be available at bus addresses below 4GB).

As part of its initialization, the device driver indicates the device is 64-bit DMA capable by calling pci_set_dma_mask(dev, DMA_BIT_MASK(64)) (and checks the return value is 0). Then, it indicates the device can only address 32-bit consistent memory by calling pci_set_consistent_dma_mask(dev, DMA_BIT_MASK(32)) (and also checks the return value is 0).

To set up DMA transfers, the driver allocates one or more 4kB pages in consistent memory: pci_alloc_consistent(dev, size, &dmaAddress).

One of our customers is reporting issues on an NVIDIA Xavier module (Linux kernel 4.9.108-tegra). Our investigations have shown that the dmaAddress obtained from pci_alloc_consistent is above 4GB (e.g. 0x0000_0004_fdff_f000). Considering that pci_set_consistent_dma_mask(dev, DMA_BIT_MASK(32)) was successful, this should not happen. DMA addresses obtained from pci_alloc_consistent should be less than 4GB.
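
The expectation above comes down to plain mask arithmetic. The following is a minimal user-space sketch, not driver code: DMA_BIT_MASK is written the same way as the kernel macro, while fits_mask is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same expression as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: true if no bit of addr lies outside mask. */
static bool fits_mask(uint64_t addr, uint64_t mask)
{
    return (addr & ~mask) == 0;
}
```

With the address reported above, fits_mask(0x00000004fdfff000ULL, DMA_BIT_MASK(32)) is false, while the same address does fit a 64-bit mask.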

Is this a known issue? Is a fix available?

Thank you

You are right that it shouldn’t return >32-bit addresses when the mask is set for 32-bit addresses only.
We’ll investigate this more and update.

Any progress on this?

We are still investigating the issue.
Meanwhile, can you please use the following change to limit IOVA addresses to the 32-bit addressable region?

diff --git a/kernel-dts/tegra194-soc/tegra194-soc-base.dtsi b/kernel-dts/tegra194-soc/tegra194-soc-base.dtsi
index 186dc20d818f..c6d061115bfd 100644
--- a/kernel-dts/tegra194-soc/tegra194-soc-base.dtsi
+++ b/kernel-dts/tegra194-soc/tegra194-soc-base.dtsi
@@ -554,14 +554,14 @@
                        };
                        pcie0_as: pcie0 {
                                iova-start = <0x0 0x80000000>;
-                               iova-size = <0x4 0x7FFFFFFF>;
+                               iova-size = <0x0 0x7FFFFFFF>;
                                alignment = <0xFFFFF>;
                                num-pf-page = <0>;
                                gap-page = <1>;
                        };
                        pcie1_as: pcie1 {
                                iova-start = <0x0 0x80000000>;
-                               iova-size = <0x1F 0x7FFFFFFF>;
+                               iova-size = <0x0 0x7FFFFFFF>;
                                alignment = <0xFFFFF>;
                                num-pf-page = <0>;
                                gap-page = <1>;
@@ -582,14 +582,14 @@
                        };
                        pcie4_as: pcie4 {
                                iova-start = <0x0 0x80000000>;
-                               iova-size = <0x4 0x7FFFFFFF>;
+                               iova-size = <0x0 0x7FFFFFFF>;
                                alignment = <0xFFFFF>;
                                num-pf-page = <0>;
                                gap-page = <1>;
                        };
                        pcie5_as: pcie5 {
                                iova-start = <0x0 0x80000000>;
-                               iova-size = <0x4 0x7FFFFFFF>;
+                               iova-size = <0x0 0x7FFFFFFF>;
                                alignment = <0xFFFFF>;
                                num-pf-page = <0>;
                                gap-page = <1>;
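
To see why the change above keeps IOVAs below 4GB, the window bounds can be worked out by hand. This is a rough user-space sketch; it assumes the window spans [iova-start, iova-start + iova-size], which is an interpretation of the properties and not something confirmed in this thread. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a two-cell <hi lo> device-tree value into one 64-bit number. */
static uint64_t dt_cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Assumption: iova-size is "length minus one", so the last usable
 * address of the window is simply start + size.
 */
static uint64_t iova_window_last(uint64_t start, uint64_t size)
{
    return start + size;
}
```

Under that assumption, the original pcie0 values (start 0x80000000, size 0x4_7FFFFFFF) give a window ending at 0x4_FFFFFFFF, which contains the reported address 0x4_fdff_f000. The patched size 0x0_7FFFFFFF ends the window at 0xFFFFFFFF.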

We checked internally and this kind of issue is not observed.
Can you please confirm that you are checking the DMA/bus address (IOVA) and not the CPU VA returned by the pci_alloc_consistent() API?

Yes, of course we check the DMA address, not the virtual address.

Is it possible for you guys to reproduce this issue with a standard off-the-shelf PCIe card and share repro procedure?

We have the same issue on the Jetson TX2 with R32.1 (kernel 4.9) so we thought it could be useful to share this information.
With R28.1 (kernel 4.4) there is no issue on the TX2.

Following the suggestion for the Xavier, we could work around the problem on the TX2 with R32.1 by limiting pcie_as.iova-start value to <0x0 0x7fffffff> in the device-tree.

This means these kernel allocation functions rely on the size of the region and do not always take into account the coherent_dma_mask the user could set.

On R32.1 we have the following sequence:

pci_alloc_consistent()
  dma_alloc_coherent()
    dma_alloc_attrs()
      __iommu_alloc_attrs(): coherent_dma_mask is not always taken into account further down
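
To illustrate what honoring coherent_dma_mask would mean for the allocation, here is a toy user-space model. It is not the Tegra kernel's code: alloc_limit and place_top_down are hypothetical names, and top-down placement with 4kB alignment is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL

/* Highest address the allocator may use: the window end, clamped by the mask. */
static uint64_t alloc_limit(uint64_t window_last, uint64_t coherent_mask)
{
    return window_last < coherent_mask ? window_last : coherent_mask;
}

/*
 * Toy top-down placement: returns the highest page-aligned address such
 * that [addr, addr + size - 1] fits in [window_first, limit], or 0 if
 * the request does not fit. size must be non-zero.
 */
static uint64_t place_top_down(uint64_t window_first, uint64_t limit,
                               uint64_t size)
{
    uint64_t addr;

    if (limit < window_first || limit - window_first < size - 1)
        return 0;

    addr = (limit - (size - 1)) & ~(TOY_PAGE_SIZE - 1);
    return addr >= window_first ? addr : 0;
}
```

In this model, a 4kB request in a window of [0x80000000, 0x4_FFFFFFFF] lands at 0x4_FFFFF000 when only the window is considered, and at 0xFFFFF000 once the limit is clamped with a 32-bit mask.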

Also, we had a look at the history of drivers/iommu/dma-iommu.c in
http://nv-tegra.nvidia.com/gitweb/?p=linux-4.9.git;a=history;f=drivers/iommu/dma-iommu.c;h=064c95bfd6a5abba4984da27adb7bc81814c5a59;hb=064c95bfd6a5abba4984da27adb7bc81814c5a59
and noticed that the following patch was perhaps missing:
https://github.com/torvalds/linux/commit/f51d7bb79c1124f7f02e9f472ef935eba13bca8e#diff-dee8f6b6d2c17383f4d78a787444b67e

We will check and get back to you.

We investigated the issue on Jetson-Xavier but couldn’t repro it.
@grrrr1433, can you please try the following patch on your system to see whether you can hit the error case? You can put it at the start of your device driver, or at another appropriate place.

int rc; /* drop this line if rc is already declared in the enclosing function */
dma_addr_t dma_addr;
void *p_iova = NULL;
u32 retry = 2000000000;

/* Streaming DMA may use the full 64-bit address space. */
rc = dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
if (rc) {
    dev_err(&pdev->dev, "64-bit DMA enable failed\n");
    return rc;
}

/* Coherent (consistent) allocations must stay below 4GB. */
rc = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
if (rc) {
    dev_err(&pdev->dev, "32-bit consistent DMA enable failed\n");
    return rc;
}

/* Buffers are never freed here, so each pass gets a new IOVA until memory runs out. */
while (retry--) {
    p_iova = dma_alloc_coherent(&pdev->dev, 1 * 1024 * 1024, &dma_addr, GFP_KERNEL);
    if (!p_iova) {
        pr_alert("----no more memory for testing\n");
        break;
    }
    pr_info("---> IOVA = %llx\n", (unsigned long long)dma_addr);
    if (upper_32_bits(dma_addr)) {
        pr_info("---> detected a 64-bit IOVA (%llx)\n", (unsigned long long)dma_addr);
        break;
    }
}
