Installing the MLNX_OFED driver in a Kata container

Hi,

My InfiniBand controller is a Mellanox Technologies MT28908 Family [ConnectX-6].

I have set up VFIO for this device and passed it through to a Kata container. I can see the device with lspci, and the driver installed successfully. However, I can’t use the device, i.e. I can’t open it.
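
When reproducing a passthrough setup like this, it can help to confirm which kernel driver owns the adapter on each side. The following is a minimal Python sketch, assuming the usual sysfs layout in which /sys/bus/pci/devices/<address>/driver is a symlink to the bound driver. The function name and the PCI address are hypothetical and not taken from this thread.

```python
import os


def bound_driver(bdf, sysfs_root="/sys"):
    """Return the name of the kernel driver bound to a PCI device, or None."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "driver")
    if not os.path.islink(link):
        # No driver is bound, or the device is not present at this address.
        return None
    # The symlink target ends in the driver name, e.g. ".../drivers/vfio-pci".
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Hypothetical address; substitute the one reported by "lspci -D".
    print(bound_driver("0000:01:00.0"))
```

On the host one would typically expect vfio-pci here, and inside the guest mlx5_core, the driver for ConnectX-6 adapters.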

Here are the issues I have met:

  1. In the Kata container, ibstat shows the device state as “Active” and “Link up”. However, when I run ibv_devinfo, it returns the error “can’t open device”.

  2. In the Kata container, I can’t see the character devices under the /dev directory. After I manually created the character device nodes for the InfiniBand device with mknod, I still couldn’t open them.

  3. I can’t start opensm. The log shows an error from open_umad_port(null:0): can’t open umad port.
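
All three symptoms involve the uverbs and umad character devices, so one check worth making is whether the nodes under /dev/infiniband match what the kernel registered. The sketch below is a minimal Python example. It assumes the standard sysfs classes infiniband_verbs and infiniband_mad, whose per-device "dev" files hold the major:minor numbers, and it assumes the nodes are expected under /dev/infiniband. The function names are hypothetical. It does not diagnose the root cause; it only reveals missing nodes or nodes created by mknod with numbers that differ from those in sysfs.

```python
import os
import stat

# sysfs classes that describe the InfiniBand character devices
CLASSES = ("infiniband_verbs", "infiniband_mad")


def expected_nodes(sysfs_root="/sys"):
    """Map node name (e.g. 'uverbs0') to the (major, minor) sysfs reports."""
    nodes = {}
    for cls in CLASSES:
        cls_dir = os.path.join(sysfs_root, "class", cls)
        if not os.path.isdir(cls_dir):
            continue
        for name in sorted(os.listdir(cls_dir)):
            dev_file = os.path.join(cls_dir, name, "dev")
            # Entries without a "dev" file (e.g. abi_version) are skipped.
            if os.path.isfile(dev_file):
                with open(dev_file) as f:
                    major, minor = f.read().strip().split(":")
                nodes[name] = (int(major), int(minor))
    return nodes


def check_nodes(sysfs_root="/sys", dev_dir="/dev/infiniband"):
    """Return {name: status} for every node that sysfs says should exist."""
    report = {}
    for name, expected in expected_nodes(sysfs_root).items():
        path = os.path.join(dev_dir, name)
        try:
            st = os.stat(path)
        except FileNotFoundError:
            report[name] = "missing"
            continue
        if not stat.S_ISCHR(st.st_mode):
            report[name] = "not a char device"
        elif (os.major(st.st_rdev), os.minor(st.st_rdev)) != expected:
            report[name] = "wrong major:minor"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_nodes().items():
        print(name, status)
```

If every node reports "ok" and opening it still fails, the cause more likely lies elsewhere, for example in how the container runtime restricts device access. That is an assumption to verify, not something confirmed in this thread.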

I am using the latest MLNX_OFED driver version, 24.07, and the Kata container uses kernel 5.15.0. I wonder whether this is a configuration issue or an incompatibility between Kata and the MLNX_OFED driver.

Thank you for your time!
Sincerely,
Darren

Hello shaoyezheng,

Thank you for posting your inquiry to the NVIDIA Developer Forums.

Per the MLNX_OFED release notes, we neither support nor test deployments with Kata containers. As such, you will need to seek guidance for your solution from the Kata community.

We do offer support for deployments with Docker and K8s. Please refer to this section of the MLNX_OFED documentation for more details:
https://docs.nvidia.com/networking/display/mlnxofedv24070610/docker+containers

Best regards,
NVIDIA Enterprise Experience
