dma_map_sg fails when calling cudaHostAlloc on an AMD CPU machine with Linux kernel 4.15.0-112

My machine info:

root@server-wzj:~# cat /proc/version
Linux version 4.15.0-112-generic (buildd@lcy01-amd64-027) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020
root@iluvatar-ae-server-wzj:~# free -g
              total        used        free      shared  buff/cache   available
Mem:             31           0          14           0          15          29
Swap:             0           0           0
root@server-wzj:~#
root@server-wzj:~# lsb_release -a
LSB Version:    core-9.20170808ubuntu1-noarch:printing-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.6 LTS
Release:        18.04
Codename:       bionic
root@server-wzj:~# nvidia-smi
Thu May 25 01:26:40 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                        Off| 00000000:C1:00.0 Off |                    0 |
| N/A   54C    P0               30W /  70W|      2MiB / 15360MiB |      7%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
root@server-wzj:~#

I modified the CUDA sample simpleZeroCopy with the following patch:

root@server-wzj:~/zhanged/cuda-samples/Samples/0_Introduction/simpleZeroCopy# git diff
diff --git a/Samples/0_Introduction/simpleZeroCopy/simpleZeroCopy.cu b/Samples/0_Introduction/simpleZeroCopy/simpleZeroCopy.cu
index bb23d0a..051158f 100644
--- a/Samples/0_Introduction/simpleZeroCopy/simpleZeroCopy.cu
+++ b/Samples/0_Introduction/simpleZeroCopy/simpleZeroCopy.cu
@@ -154,7 +154,8 @@ int main(int argc, char **argv) {
   /* Allocate mapped CPU memory. */

   nelem = 1048576;
-  bytes = nelem * sizeof(float);
+  bytes = 5000000000 / 2;
+  //bytes = nelem * sizeof(float);

   if (bPinGenericMemory) {
 #if CUDART_VERSION >= 4000

Run simpleZeroCopy:

root@server-wzj:~/zhanged/cuda-samples/Samples/0_Introduction/simpleZeroCopy# ./a.out
  Device 0: <          Turing >, Compute SM 7.5 detected
> Using CUDA Host Allocated (cudaHostAlloc)
CUDA error at simpleZeroCopy.cu:180 code=304(cudaErrorOperatingSystem) "cudaHostAlloc((void **)&b, bytes, flags)"

dmesg report:

[Wed May 24 23:58:48 2023] 0000:c1:00.0: IOMMU mapping error in map_sg (io-pages: 610352)
[Wed May 24 23:58:48 2023] NVRM: 0000:c1:00.0: Failed to create a DMA mapping!"
[Wed May 24 23:59:14 2023] watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [a.out:19506]
[Wed May 24 23:59:14 2023] Modules linked in: nvidia_uvm(OE) nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) vfio_iommu_type1 vfio xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack br_netfilter bridge stp llc aufs overlay nls_iso8859_1 edac_mce_amd rndis_host kvm_amd cdc_ether kvm usbnet mii input_leds joydev irqbypass ipmi_ssif ipmi_si ipmi_devintf k10temp shpchp mac_hid ipmi_msghandler sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi parport_pc ppdev lp parport bi_driver(OE) ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
[Wed May 24 23:59:14 2023]  libcrc32c raid1 raid0 multipath linear crct10dif_pclmul ast crc32_pclmul i2c_algo_bit ghash_clmulni_intel ttm pcbc aesni_intel drm_kms_helper hid_generic aes_x86_64 syscopyarea crypto_simd sysfillrect glue_helper sysimgblt cryptd tg3 fb_sys_fops ahci usbhid ptp drm hid libahci pps_core i2c_piix4 [last unloaded: video]
[Wed May 24 23:59:14 2023] CPU: 35 PID: 19506 Comm: a.out Tainted: P           OE    4.15.0-112-generic #113-Ubuntu
[Wed May 24 23:59:14 2023] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.0 02/22/2021
[Wed May 24 23:59:14 2023] RIP: 0010:fetch_pte.isra.7+0x185/0x190
[Wed May 24 23:59:14 2023] RSP: 0018:ffffa3e707ccb8b8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff11
[Wed May 24 23:59:14 2023] RAX: 0000000000000003 RBX: 0000000000001000 RCX: 0000000000000027
[Wed May 24 23:59:14 2023] RDX: 0000896b98821000 RSI: 0000008000000000 RDI: 0000000000000004
[Wed May 24 23:59:14 2023] RBP: ffffa3e707ccb8c0 R08: 0000000000000000 R09: ffff9665201a3890
[Wed May 24 23:59:14 2023] R10: ffffa3e707ccb8d0 R11: 000000000000001b R12: ffff966a6888f098
[Wed May 24 23:59:14 2023] R13: ffff966a6888f000 R14: 0000000000000000 R15: 0000896b98821000
[Wed May 24 23:59:14 2023] FS:  00007f819a75c000(0000) GS:ffff966a6e0c0000(0000) knlGS:0000000000000000
[Wed May 24 23:59:14 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed May 24 23:59:14 2023] CR2: 00007f818d0dd1d8 CR3: 00000007b720a000 CR4: 0000000000340ee0
[Wed May 24 23:59:14 2023] Call Trace:
[Wed May 24 23:59:14 2023]  iommu_unmap_page+0x64/0x100
[Wed May 24 23:59:14 2023]  __unmap_single.isra.27+0x62/0x100
[Wed May 24 23:59:14 2023]  unmap_sg+0x5f/0x70
[Wed May 24 23:59:14 2023]  nv_unmap_dma_map_scatterlist+0x4d/0xa0 [nvidia]
[Wed May 24 23:59:14 2023]  nv_dma_unmap_pages+0x56/0x100 [nvidia]
[Wed May 24 23:59:14 2023]  nv_dma_unmap_alloc+0x34/0x50 [nvidia]
[Wed May 24 23:59:14 2023]  _nv038222rm+0xc5/0x1d0 [nvidia]
[Wed May 24 23:59:14 2023] WARNING: kernel stack frame pointer at 00000000cdf85af7 in a.out:19506 has bad value 000000003677a0b0

I found that this is a bug in the AMD Linux IOMMU driver, as the following patch shows: npages is declared as int, which overflows when computing npages << PAGE_SHIFT for mappings larger than 2 GiB.

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 97baf88d9505..189ecb206471 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2453,7 +2453,7 @@ static int sg_num_pages(struct device *dev,
 {
 	unsigned long mask, boundary_size;
 	struct scatterlist *s;
-	int i, npages = 0;
+	unsigned long i, npages = 0;

 	mask = dma_get_seg_boundary(dev);
 	boundary_size = mask + 1 ? ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT :
@@ -2505,7 +2505,7 @@ static int map_sg(struct device *dev, struct scatterlist *sglist,
 	/* Map all sg entries */
 	for_each_sg(sglist, s, nelems, i) {
-		int j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);
+		unsigned long j, pages = iommu_num_pages(sg_phys(s), s->length, PAGE_SIZE);

 		for (j = 0; j < pages; ++j) {
 			unsigned long bus_addr, phys_addr;

So I tried the following methods; each of them resolves the bug:

  1. Modify the Linux 4.15 AMD IOMMU driver source code, changing int to unsigned long, then rebuild and reinstall.
  2. Update the Linux kernel to version 5.8.
  3. Disable the IOMMU.

But we cannot update the kernel version at will; we must use the Ubuntu release kernel. So I am here for some help. I think this bug affects any CUDA API that calls the Linux dma_map_sg API to map a large region of system memory. Has NVIDIA encountered this bug? Can we avoid it in the CUDA toolkit?
By the way, this bug does not appear on another machine that has the same AMD CPU and the same Linux kernel version.

I find a similar question.
https://forums.developer.nvidia.com/t/kernel-call-trace-observed-when-calling-cudafreehost-cudahostalloc-for-buffers-on-amd-cpu-with-nvi/72930

Another option then might be to use a newer Ubuntu version, which will come with a newer default kernel.

If you are stuck with an old operating system, and not allowed to make changes, then you are stuck with its bugs.

If you want to work around this in CUDA, I suppose you could limit the size of pinned memory requested so as to not trigger the bug.

Thank you, we will consider limiting the size of pinned memory requested.

Hi,
we bypassed this bug by setting iommu=pt, which does not affect KVM.

echo 'GRUB_CMDLINE_LINUX="amd_iommu=on iommu=pt"' >> /etc/default/grub
sudo grub-mkconfig -o /boot/efi/EFI/ubuntu/grub.cfg
update-grub

reboot

Thanks.