I use a userspace PCIe device driver to communicate with my PCIe storage device, bypassing the OS kernel. This removes a lot of software overhead in the kernel, especially in the block layer.
I use a VFIO container to get higher I/O performance and lower overhead. The VFIO device API includes ioctls for describing the device, the I/O regions and their read/write/mmap offsets on the device descriptor, as well as mechanisms for describing and registering interrupt notifications. I use these VFIO APIs (read/write/mmap) to communicate with my PCIe device.
The userspace code works well on the x86 architecture, where the preconditions are: 1) the CPU supports VT-d (Virtualization Technology for Directed I/O), and 2) VFIO and IOMMU are enabled in the kernel config. Now I'm trying to make the userspace code support the ARM architecture, and that is where I hit the issue above.
My userspace code does not call the domain free function directly. During the VFIO create phase it has to send a series of ioctls to VFIO, and the kernel Oops BUG 0 occurs during the VFIO_SET_IOMMU ioctl. The detailed function call trace is below:
VFIO create // in my userspace code
dev->contfd = open("/dev/vfio/vfio", O_RDWR) // SUCCESS
ioctl(dev->contfd, VFIO_GET_API_VERSION) // SUCCESS
ioctl(dev->contfd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) // SUCCESS
dev->groupfd = open(path, O_RDWR) // SUCCESS
ioctl(dev->groupfd, VFIO_GROUP_GET_STATUS, &group_status) // SUCCESS
ioctl(dev->groupfd, VFIO_GROUP_SET_CONTAINER, &dev->contfd) // SUCCESS
ioctl(dev->contfd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) // kernel Oops BUG 0 occurs here; the kernel function call trace follows
----------------------------------// Above operations are in my userspace code ----------------------------------------------------
→ vfio_fops_unl_ioctl() // implement VFIO_SET_IOMMU ioctl
→ vfio_ioctl_set_iommu()
→ __vfio_container_attach_groups()
→ vfio_iommu_type1_attach_group() // kzalloc() for the domain succeeds
→ iommu_attach_group()
→ iommu_group_do_attach_device()
→ arm_smmu_attach_dev() // in arm-smmu.c: the "already attached to IOMMU domain" error occurs in this function, which returns -EEXIST
→ iommu_domain_free() // in vfio_iommu_type1.c
→ arm_smmu_domain_free() // kfree() of the domain that was allocated with kzalloc() in vfio_iommu_type1_attach_group(); the kernel Oops BUG 0 occurs in the kfree() implementation
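For reference, the userspace half of the trace (everything above the separator line) can be written as a small C sketch with error checking on every step. This is an illustration under assumptions, not my actual driver code: `struct vfio_fds` and `vfio_setup` are hypothetical names, and `group_path` stands in for the `path` variable in the trace. The ioctls and constants come from `<linux/vfio.h>`.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/* Hypothetical holder for the two descriptors used in the trace. */
struct vfio_fds {
    int contfd;   /* container, normally /dev/vfio/vfio */
    int groupfd;  /* IOMMU group node under /dev/vfio/ */
};

/* Hypothetical helper: run the container/group setup sequence.
 * Returns 0 on success; on failure closes both fds and returns -1. */
int vfio_setup(const char *container_path, const char *group_path,
               struct vfio_fds *out)
{
    struct vfio_group_status status;

    memset(&status, 0, sizeof(status));
    status.argsz = sizeof(status);
    out->contfd = -1;
    out->groupfd = -1;

    out->contfd = open(container_path, O_RDWR);
    if (out->contfd < 0) { perror("open container"); goto fail; }

    if (ioctl(out->contfd, VFIO_GET_API_VERSION) != VFIO_API_VERSION) {
        fprintf(stderr, "unexpected VFIO API version\n");
        goto fail;
    }
    /* VFIO_CHECK_EXTENSION returns a positive value when supported. */
    if (ioctl(out->contfd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU) <= 0) {
        fprintf(stderr, "type1 IOMMU not supported\n");
        goto fail;
    }

    out->groupfd = open(group_path, O_RDWR);
    if (out->groupfd < 0) { perror("open group"); goto fail; }

    if (ioctl(out->groupfd, VFIO_GROUP_GET_STATUS, &status) < 0) {
        perror("VFIO_GROUP_GET_STATUS");
        goto fail;
    }
    /* Every device in the group must be bound to a VFIO driver. */
    if (!(status.flags & VFIO_GROUP_FLAGS_VIABLE)) {
        fprintf(stderr, "group is not viable\n");
        goto fail;
    }
    if (ioctl(out->groupfd, VFIO_GROUP_SET_CONTAINER, &out->contfd) < 0) {
        perror("VFIO_GROUP_SET_CONTAINER");
        goto fail;
    }
    /* This is the call during which the Oops is triggered on ARM. */
    if (ioctl(out->contfd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0) {
        perror("VFIO_SET_IOMMU");
        goto fail;
    }
    return 0;

fail:
    if (out->groupfd >= 0) close(out->groupfd);
    if (out->contfd >= 0) close(out->contfd);
    out->contfd = -1;
    out->groupfd = -1;
    return -1;
}
```

The viability check is included because a group that is not fully bound to a VFIO driver fails later in the sequence with a less obvious error. It does not explain the Oops itself, which happens inside the kernel after the group has already been accepted.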