Hi all,
I’m testing SR-IOV on Xen 4.2 with a ConnectX-3 MT27500 VPI adapter card. I have installed MLNX_OFED 2.2-1.0.1 with SR-IOV support in Dom0 (Xen 4.2 on CentOS 6.4). The installation script finished without errors, and mlx4_core.conf looks like this:
[root@hypervisor ~]# cat /etc/modprobe.d/mlx4_core.conf
options mlx4_core num_vfs=8 port_type_array=1,2 probe_vf=1 enable_64b_cqe_eqe=0
Everything seems to be OK, because lspci shows the physical function and the eight virtual functions:
01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
01:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.5 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.6 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:00.7 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
01:01.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
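A quick way to cross-check the listing against the module options is to count the VF entries and compare the result with num_vfs. The following is only a minimal sketch: the helper names count_vfs and configured_vfs are hypothetical, and both simply filter text on stdin, so they assume the lspci and mlx4_core.conf formats shown in this post.

```shell
#!/bin/sh
# Hypothetical helpers (not part of MLNX_OFED); both read text from stdin.

# Count SR-IOV virtual functions in lspci-style output.
count_vfs() {
    grep -c 'Virtual Function'
}

# Extract the num_vfs value from an "options mlx4_core ..." line.
configured_vfs() {
    sed -n 's/.*num_vfs=\([0-9][0-9]*\).*/\1/p'
}

# Intended use in Dom0 (commands and path taken from the post):
#   lspci | count_vfs
#   configured_vfs < /etc/modprobe.d/mlx4_core.conf
```

If the two numbers differ, the driver did not create all of the requested VFs.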
In Dom0, I can use the PF and one VF:
[root@hypervisor ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:f452:1403:0006:9421
base lid: 0x7
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 56 Gb/sec (4X FDR)
link_layer: InfiniBand
Infiniband device 'mlx4_1' port 1 status:
default gid: fe80:0000:0000:0000:0014:0500:0000:0088
base lid: 0x7
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 56 Gb/sec (4X FDR)
link_layer: InfiniBand
When I pass a VF through to a fully virtualized (HVM) guest, I can use the card in the guest without problems.
The guest detects the VF and it works correctly:
[root@VM1 ~]# /etc/init.d/openibd restart
Unloading HCA driver: [ OK ]
Loading HCA driver and Access Layer: [ OK ]
Setting up InfiniBand network interfaces:
Bringing up interface ib0: [ OK ]
Setting up service network . . . [ done ]
[root@VM1 ~]# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)
00:04.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
00:05.0 Multimedia audio controller: Ensoniq ES1370 [AudioPCI]
[root@VM1 ~]# ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80:0000:0000:0000:0014:0500:0000:008d
base lid: 0x7
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 56 Gb/sec (4X FDR)
link_layer: InfiniBand
However, the problem appears when I pass a VF through to a paravirtualized (PV) guest.
The card seems to be detected correctly:
[root@VM0 ~]# lspci
00:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
But I cannot restart openibd:
[root@VM0 ~]# /etc/init.d/openibd restart
Unloading HCA driver: [ OK ]
Killed
Loading Mellanox MLX4 HCA driver: [FAILED]
^C
Please wait …
Loading Mellanox MLX4_IB HCA driver: [FAILED]
^C
Please wait …
Loading Mellanox MLX4_EN HCA driver: [FAILED]
Loading HCA driver and Access Layer: [FAILED]
And the dmesg command shows:
[root@VM0 ~]# dmesg
mlx4_core: Mellanox ConnectX core driver v1.1 (Oct 22 2014)
mlx4_core: Initializing 0000:00:00.3
mlx4_core 0000:00:00.3: enabling device (0000 -> 0002)
mlx4_core 0000:00:00.3: Xen PCI mapped GSI0 to IRQ29
mlx4_core 0000:00:00.3: Detected virtual function - running in slave mode
mlx4_core 0000:00:00.3: Sending reset
mlx4_core 0000:00:00.3: Sending vhcr0
mlx4_core 0000:00:00.3: HCA minimum page size:512
mlx4_core 0000:00:00.3: Timestamping is not supported in slave mode.
BUG: unable to handle kernel paging request at ffffc9000032c00c
IP: [] msix_capability_init+0x29c/0x300
PGD 782ba067 PUD 782bb067 PMD 783be067 PTE 8010000000000464
Oops: 0003 [#1] SMP
Modules linked in: mlx4_core(O+) compat(O) ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 knem(O) coretemp hwmon crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode joydev xen_netfront pcspkr ext4 jbd2 mbcache xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: compat]
CPU: 0 PID: 1609 Comm: modprobe Tainted: G O 3.10.55-11.el6.centos.alt.x86_64 #1
task: ffff880004fdc040 ti: ffff880004e1e000 task.ti: ffff880004e1e000
RIP: e030:[] [] msix_capability_init+0x29c/0x300
RSP: e02b:ffff880004e1fa28 EFLAGS: 00010286
RAX: ffffc9000032c00c RBX: ffff880074f14000 RCX: 0000000000000005
RDX: 0000000000000001 RSI: ffff880077ff0280 RDI: ffff880077ff0280
RBP: ffff880004e1fa78 R08: 00000000f3cd794f R09: 000000005448e430
R10: 00000000000a1dfa R11: 00000000000a1ca7 R12: 0000000000000000
R13: ffff880004d367c0 R14: 0000000000000000 R15: ffffc9000032c00c
FS: 00007f64e9ba4700(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff80010ffb8010 CR3: 0000000005213000 CR4: 0000000000042660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff880004cd2b80 ffff880074f14860 00000000000080d0 c06b880000002000
ffff880074f14000 ffff880004cd2b80 ffff880074f14000 ffff8800749b0000
ffff880004cd2b80 ffffffff81812328 ffff880004e1faa8 ffffffff812ebc3f
Call Trace:
[] pci_enable_msix+0x10f/0x160
[] mlx4_enable_msi_x+0x14d/0x200 [mlx4_core]
[] mlx4_load_one+0x9c4/0xdb0 [mlx4_core]
[] __mlx4_init_one+0x48c/0x590 [mlx4_core]
[] ? printk+0x4d/0x4f
[] mlx4_init_one+0x32/0x60 [mlx4_core]
[] local_pci_probe+0x4e/0x90
[] __pci_device_probe+0xd1/0xe0
[] ? pci_dev_get+0x22/0x30
[] pci_device_probe+0x3a/0x60
[] really_probe+0x7a/0x250
[] driver_probe_device+0x3e/0x60
[] __driver_attach+0xab/0xb0
[] ? driver_probe_device+0x60/0x60
[] ? driver_probe_device+0x60/0x60
[] bus_for_each_dev+0x94/0xb0
[] driver_attach+0x1e/0x20
[] bus_add_driver+0x1e8/0x250
[] ? mlx4_init+0xaa/0xaa [mlx4_core]
[] driver_register+0x74/0x160
[] ? mlx4_init+0xaa/0xaa [mlx4_core]
[] __pci_register_driver+0x4c/0x50
[] mlx4_init+0x7a/0xaa [mlx4_core]
[] __init_backport+0xe/0xcfe [mlx4_core]
[] do_one_initcall+0x42/0x170
[] do_init_module+0x80/0x1f0
[] load_module+0x3e8/0x5a0
[] ? mod_sysfs_teardown+0x150/0x150
[] ? copy_module_from_user+0x67/0xc0
[] ? module_sect_show+0x30/0x30
[] SyS_init_module+0x93/0xa0
[] system_call_fastpath+0x16/0x1b
Code: 41 83 c7 0c e8 56 ca df ff 4d 63 ff 4d 03 7d 20 41 8b 17 41 0f b7 45 02 41 89 55 08 83 ca 01 c1 e0 04 83 c0 0c 89 c0 49 03 45 20 <89> 10 41 89 55 08 49 8b 45 10 41 83 c6 01 48 39 45 b8 4c 8d 68
RIP [] msix_capability_init+0x29c/0x300
RSP
CR2: ffffc9000032c00c
---[ end trace 74e9d763ef2ef85b ]---
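For anyone skimming the trace, the headline lines can be pulled out of the dmesg output with a small filter. This is just a sketch: oops_summary is a hypothetical helper name, and the pattern assumes the oops format shown above (lines containing "BUG:", "IP:", "Oops:" or "RIP ").

```shell
#!/bin/sh
# Hypothetical helper: print only the headline lines of a kernel oops
# from dmesg-style text read on stdin.
oops_summary() {
    grep -E 'BUG:|Oops:|IP:|RIP '
}

# Intended use in the guest:
#   dmesg | oops_summary
```

On the trace above this keeps the paging-request line and the lines naming msix_capability_init as the faulting function.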
Any help would be appreciated.
Thanks in advance,
Javi