"mlx5_vdpa: disagrees about version" Error After Upgrading MLNX_OFED Driver

After upgrading the MLNX_OFED Driver to version 5.8-2.0.3, the following error occurs during boot:

error log

Jul 10 05:41:46 Qacloudhost06 kernel: [2263067.316179] mlx5_vdpa: disagrees about version of symbol mlx5_db_free                                            
Jul 10 05:41:46 Qacloudhost06 kernel: [2263067.316185] mlx5_vdpa: Unknown symbol mlx5_db_free (err -22)                     
Jul 10 05:41:46 Qacloudhost06 kernel: [2263067.316212] mlx5_vdpa: disagrees about version of symbol mlx5_query_nic_vport_mtu                                
Jul 10 05:41:46 Qacloudhost06 kernel: [2263067.316213] mlx5_vdpa: Unknown symbol mlx5_query_nic_vport_mtu (err -22)                                         
Jul 10 05:41:46 Qacloudhost06 kernel: [2263067.316227] mlx5_vdpa: disagrees about version of symbol mlx5_create_auto_grouped_flow_table  

mlx5_vdpa driver information

filename:       /lib/modules/5.15.0-60-generic/kernel/drivers/vdpa/mlx5/mlx5_vdpa.ko                                                                        
license:        Dual BSD/GPL                                                                                                                                
description:    Mellanox VDPA driver                                                                                                                        
author:         Eli Cohen <eli@mellanox.com>                                                                                                                
srcversion:     7E302F0D222DB0C740AEE6A                                                                                                                     
alias:          auxiliary:mlx5_core.vnet                                                                                                                    
depends:        mlx5_core,vhost_iotlb,vringh,vdpa                                                                                                           
retpoline:      Y                                                                                                                                           
intree:         Y                                                                                                                                           
name:           mlx5_vdpa                                                                                                                                   
vermagic:       5.15.0-60-generic SMP mod_unload modversions                                                                                                
sig_id:         PKCS#7                                                                                                                                      
signer:         Build time autogenerated kernel key                                                                                                         
sig_key:        0D:04:40:B9:A2:DE:02:2B:3C:CE:07:73:95:8B:8F:C1:58:B8:F5:D4                                                                                 
sig_hashalgo:   sha512                                                                                                                                      
...

Checking the mlx5_vdpa module shows that it is the in-tree driver shipped with the 5.15 kernel (intree: Y, installed under /lib/modules/5.15.0-60-generic/kernel/), so its symbol versions appear to conflict with the mlx5_core driver that MLNX_OFED presumably replaced.
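
One way to check this is to look at where each module file is installed. The sketch below assumes that out-of-tree (MLNX_OFED/DKMS) modules sit under an updates/ directory and in-tree modules under kernel/, which is the usual layout on Ubuntu but is an assumption here. module_origin is a hypothetical helper name, not part of any tool.

```shell
# Classify a module file by its install path.
# Assumption: DKMS/OFED builds land under .../updates/, in-tree modules under .../kernel/.
module_origin() {
    case "$1" in
        */updates/*) echo "out-of-tree" ;;
        */kernel/*)  echo "in-tree" ;;
        *)           echo "unknown" ;;
    esac
}

# Print the resolved file and its origin for both modules.
for mod in mlx5_core mlx5_vdpa; do
    path=$(modinfo -n "$mod" 2>/dev/null) || path=""
    echo "$mod: ${path:-not found} ($(module_origin "$path"))"
done
```

If mlx5_core reports out-of-tree while mlx5_vdpa reports in-tree, the two modules were built against different mlx5_core symbol versions, which would match the "disagrees about version of symbol" messages.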

The mlx5_vdpa driver is not available on the NVIDIA driver download site, so I have blacklisted it to keep it from loading at boot.
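
For reference, a minimal sketch of such a blacklist. The file name and the write_blacklist helper are hypothetical; any *.conf file under /etc/modprobe.d is read by modprobe.

```shell
# Hypothetical file name, chosen for illustration only.
CONF="${CONF:-/etc/modprobe.d/blacklist-mlx5-vdpa.conf}"

# Write a modprobe.d entry that stops mlx5_vdpa from being auto-loaded
# through its module alias (auxiliary:mlx5_core.vnet).
write_blacklist() {
    printf '%s\n' 'blacklist mlx5_vdpa' > "$1"
}

# Usage (as root), then rebuild the initramfs so the setting applies early in boot:
#   write_blacklist "$CONF" && update-initramfs -u
```

Note that a blacklist entry only prevents alias-based autoloading; an explicit "modprobe mlx5_vdpa" would still attempt to load the module.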

If there is a better solution or method, please let me know.

Here are the details of my setup:
Hardware

  • Server: Dell R7615
  • CPU: AMD Epyc 9654P
  • Memory: 384GB
  • NUMA: 1
  • NIC: ConnectX-6 Lx

Software Versions

  • OS: Ubuntu 22.04.2 LTS
  • Kernel: 5.15
  • OpenStack Version: Yoga
  • OVN: 22.03
  • OVS: 2.17.5
  • MLNX_OFED Driver: 5.8-2.0.3
  • Firmware: 26.35.1012 (DEL0000000031)

Hello Kyoon,

Thank you for writing to us.
I suggest installing MLNX_OFED 5.8-3.0.7.0-LTS and trying again.

Thanks and have a great day!
Ilan.

Hi ipavis,

The ConnectX-6 Lx currently in use is a Dell OEM product. Dell supports the following firmware and driver combinations:

Firmware Version    Supported Driver Version
26.36.1010          5.9-0.5.6.0 GA
26.35.1012          5.8-2.0.3.0 LTS

According to the documentation, the 5.8-3.0.7.0 version you mentioned is supported only with firmware versions 26.35.2000 through 26.35.3006.

Would there be any issues with the following configuration?

Driver Version    Firmware Version
5.8-3.0.7.0       26.36.1010