How to use SR-IOV on Xen 4.2 paravirtualized guests?

Hi all,

I'm testing SR-IOV on Xen 4.2 with a ConnectX-3 MT27500 VPI adapter card. I have installed MLNX_OFED 2.2-1.0.1 with SR-IOV support in Dom0 (Xen 4.2 on CentOS 6.4). The installation script finished without errors, and the file mlx4_core.conf looks like this:

[root@hypervisor ~]# cat /etc/modprobe.d/mlx4_core.conf

options mlx4_core num_vfs=8 port_type_array=1,2 probe_vf=1 enable_64b_cqe_eqe=0

Everything seems to be fine, because lspci shows the eight virtual functions:

01:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]

01:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:00.2 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:00.4 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:00.5 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:00.6 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:00.7 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

01:01.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
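Rather than counting the listing by eye, the number of VFs can be checked with a one-liner. This is only a minimal sketch: the helper name count_vfs is made up for illustration, and it assumes the "Virtual Function]" suffix that lspci prints in the listing above.

```shell
# count_vfs (hypothetical helper): read lspci output on stdin and print
# how many lines describe a Virtual Function.
count_vfs() {
    # -F matches the fixed string literally, -c prints the match count.
    # "|| true" keeps the exit status 0 when the count is zero.
    grep -cF 'Virtual Function]' || true
}
```

Usage in Dom0 would be `lspci | count_vfs`; the result should equal the num_vfs value in mlx4_core.conf.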

In Dom0, I can use the PF and one VF:

[root@hypervisor ~]# ibstatus

Infiniband device 'mlx4_0' port 1 status:

default gid: fe80:0000:0000:0000:f452:1403:0006:9421

base lid: 0x7

sm lid: 0x1

state: 4: ACTIVE

phys state: 5: LinkUp

rate: 56 Gb/sec (4X FDR)

link_layer: InfiniBand

Infiniband device 'mlx4_1' port 1 status:

default gid: fe80:0000:0000:0000:0014:0500:0000:0088

base lid: 0x7

sm lid: 0x1

state: 4: ACTIVE

phys state: 5: LinkUp

rate: 56 Gb/sec (4X FDR)

link_layer: InfiniBand

When I pass a VF through to a fully virtualized (HVM) guest, the guest detects the VF and it works correctly:

[root@VM1 ~]# /etc/init.d/openibd restart

Unloading HCA driver: [ OK ]

Loading HCA driver and Access Layer: [ OK ]

Setting up InfiniBand network interfaces:

Bringing up interface ib0: [ OK ]

Setting up service network . . . [ done ]

[root@VM1 ~]# lspci

00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)

00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]

00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]

00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)

00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01)

00:02.0 VGA compatible controller: Cirrus Logic GD 5446

00:03.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)

00:04.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

00:05.0 Multimedia audio controller: Ensoniq ES1370 [AudioPCI]

[root@VM1 ~]# ibstatus

Infiniband device 'mlx4_0' port 1 status:

default gid: fe80:0000:0000:0000:0014:0500:0000:008d

base lid: 0x7

sm lid: 0x1

state: 4: ACTIVE

phys state: 5: LinkUp

rate: 56 Gb/sec (4X FDR)

link_layer: InfiniBand

However, it fails when I pass a VF through to a paravirtualized (PV) guest.

The card seems to be detected correctly:

[root@VM0 ~]# lspci

00:00.3 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]

But restarting openibd fails:

[root@VM0 ~]# /etc/init.d/openibd restart

Unloading HCA driver: [ OK ]

Killed

Loading Mellanox MLX4 HCA driver: [FAILED]

^C

Please wait ...

Loading Mellanox MLX4_IB HCA driver: [FAILED]

^C

Please wait ...

Loading Mellanox MLX4_EN HCA driver: [FAILED]

Loading HCA driver and Access Layer: [FAILED]

And the dmesg command shows:

[root@VM0 ~]# dmesg

mlx4_core: Mellanox ConnectX core driver v1.1 (Oct 22 2014)

mlx4_core: Initializing 0000:00:00.3

mlx4_core 0000:00:00.3: enabling device (0000 -> 0002)

mlx4_core 0000:00:00.3: Xen PCI mapped GSI0 to IRQ29

mlx4_core 0000:00:00.3: Detected virtual function - running in slave mode

mlx4_core 0000:00:00.3: Sending reset

mlx4_core 0000:00:00.3: Sending vhcr0

mlx4_core 0000:00:00.3: HCA minimum page size:512

mlx4_core 0000:00:00.3: Timestamping is not supported in slave mode.

BUG: unable to handle kernel paging request at ffffc9000032c00c

IP: [] msix_capability_init+0x29c/0x300

PGD 782ba067 PUD 782bb067 PMD 783be067 PTE 8010000000000464

Oops: 0003 [#1] SMP

Modules linked in: mlx4_core(O+) compat(O) ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 knem(O) coretemp hwmon crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel ablk_helper cryptd lrw gf128mul glue_helper aes_x86_64 microcode joydev xen_netfront pcspkr ext4 jbd2 mbcache xen_blkfront dm_mirror dm_region_hash dm_log dm_mod [last unloaded: compat]

CPU: 0 PID: 1609 Comm: modprobe Tainted: G O 3.10.55-11.el6.centos.alt.x86_64 #1

task: ffff880004fdc040 ti: ffff880004e1e000 task.ti: ffff880004e1e000

RIP: e030:[] [] msix_capability_init+0x29c/0x300

RSP: e02b:ffff880004e1fa28 EFLAGS: 00010286

RAX: ffffc9000032c00c RBX: ffff880074f14000 RCX: 0000000000000005

RDX: 0000000000000001 RSI: ffff880077ff0280 RDI: ffff880077ff0280

RBP: ffff880004e1fa78 R08: 00000000f3cd794f R09: 000000005448e430

R10: 00000000000a1dfa R11: 00000000000a1ca7 R12: 0000000000000000

R13: ffff880004d367c0 R14: 0000000000000000 R15: ffffc9000032c00c

FS: 00007f64e9ba4700(0000) GS:ffff88007f200000(0000) knlGS:0000000000000000

CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033

CR2: ffff80010ffb8010 CR3: 0000000005213000 CR4: 0000000000042660

DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000

DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Stack:

ffff880004cd2b80 ffff880074f14860 00000000000080d0 c06b880000002000

ffff880074f14000 ffff880004cd2b80 ffff880074f14000 ffff8800749b0000

ffff880004cd2b80 ffffffff81812328 ffff880004e1faa8 ffffffff812ebc3f

Call Trace:

[] pci_enable_msix+0x10f/0x160

[] mlx4_enable_msi_x+0x14d/0x200 [mlx4_core]

[] mlx4_load_one+0x9c4/0xdb0 [mlx4_core]

[] __mlx4_init_one+0x48c/0x590 [mlx4_core]

[] ? printk+0x4d/0x4f

[] mlx4_init_one+0x32/0x60 [mlx4_core]

[] local_pci_probe+0x4e/0x90

[] __pci_device_probe+0xd1/0xe0

[] ? pci_dev_get+0x22/0x30

[] pci_device_probe+0x3a/0x60

[] really_probe+0x7a/0x250

[] driver_probe_device+0x3e/0x60

[] __driver_attach+0xab/0xb0

[] ? driver_probe_device+0x60/0x60

[] ? driver_probe_device+0x60/0x60

[] bus_for_each_dev+0x94/0xb0

[] driver_attach+0x1e/0x20

[] bus_add_driver+0x1e8/0x250

[] ? mlx4_init+0xaa/0xaa [mlx4_core]

[] driver_register+0x74/0x160

[] ? mlx4_init+0xaa/0xaa [mlx4_core]

[] __pci_register_driver+0x4c/0x50

[] mlx4_init+0x7a/0xaa [mlx4_core]

[] __init_backport+0xe/0xcfe [mlx4_core]

[] do_one_initcall+0x42/0x170

[] do_init_module+0x80/0x1f0

[] load_module+0x3e8/0x5a0

[] ? mod_sysfs_teardown+0x150/0x150

[] ? copy_module_from_user+0x67/0xc0

[] ? module_sect_show+0x30/0x30

[] SyS_init_module+0x93/0xa0

[] system_call_fastpath+0x16/0x1b

Code: 41 83 c7 0c e8 56 ca df ff 4d 63 ff 4d 03 7d 20 41 8b 17 41 0f b7 45 02 41 89 55 08 83 ca 01 c1 e0 04 83 c0 0c 89 c0 49 03 45 20 <89> 10 41 89 55 08 49 8b 45 10 41 83 c6 01 48 39 45 b8 4c 8d 68

RIP [] msix_capability_init+0x29c/0x300

RSP

CR2: ffffc9000032c00c

---[ end trace 74e9d763ef2ef85b ]---
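The trace is long, so for anyone skimming, the lines that name the fault can be pulled out with something like the following. It is only a sketch: oops_summary is a made-up helper name, and the pattern assumes the kernel marks the fault with "BUG:" and "IP:" lines as in the log above.

```shell
# oops_summary (hypothetical helper): read dmesg output on stdin and keep
# only the "BUG:" line plus the "IP: "/"RIP: " lines of the oops.
oops_summary() {
    # "IP: " also matches "RIP: ", so one alternative covers both forms.
    grep -E 'BUG:|IP: '
}
```

Run in the guest as `dmesg | oops_summary`. On the log above it keeps the paging-request line and the lines pointing at msix_capability_init, which is reached from pci_enable_msix when mlx4_core enables MSI-X.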

Any help would be appreciated.

Thanks in advance,

Javi

Hi Javi,

Thanks for the detailed explanation. A quick question: this might be related to a firmware-side bug with the VF that causes the PF to crash when the first VF is reset (by the OFED script). Is there a chance you are using FW 2.31.5050? Please check with "ibstat" and let us know.
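In case it helps, the firmware version can be pulled out of the ibstat output like this. This is a rough sketch rather than an official tool: fw_versions is a made-up helper name, and it assumes ibstat prints a "Firmware version: ..." line for each CA.

```shell
# fw_versions (hypothetical helper): read ibstat output on stdin and print
# the value of each "Firmware version" line, one per line.
fw_versions() {
    # Split on the colon and any spaces after it; field 2 is the version.
    awk -F': *' '/Firmware version/ { print $2 }'
}
```

Run it in Dom0 as `ibstat | fw_versions` and compare the result with 2.31.5050.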